In this article we give several new results on the complexity of algorithms that learn Boolean
functions from quantum queries and quantum examples. 

• Hunziker et al. [17] conjectured that for any class C of Boolean functions, the number of
quantum black-box queries which are required to exactly identify an unknown function from C
is $O\left(\frac{\log |C|}{\sqrt{\hat{\gamma}^C}}\right)$, where $\hat{\gamma}^C$ is a combinatorial parameter of the class C.
We essentially resolve this conjecture in the affirmative by giving a quantum algorithm that, for any class C,
identifies any unknown function from C using $O\left(\frac{\log |C| \log\log |C|}{\sqrt{\hat{\gamma}^C}}\right)$ quantum black-box queries.

• We consider a range of natural problems intermediate between the exact learning problem (in 
which the learner must obtain all bits of information about the black-box function) and the usual 
problem of computing a predicate (in which the learner must obtain only one bit of information 
about the black-box function). We give positive and negative results on when the quantum and 
classical query complexities of these intermediate problems are polynomially related to each other. 

• Finally, we improve the known lower bounds on the number of quantum examples (as opposed
to quantum black-box queries) required for $(\epsilon, \delta)$-PAC learning any concept class of
Vapnik-Chervonenkis dimension d over the domain $\{0,1\}^n$ from $\Omega(\frac{d}{n})$ to
$\Omega(\frac{1}{\epsilon} \log \frac{1}{\delta} + d + \frac{\sqrt{d}}{\epsilon})$. This new
lower bound comes closer to matching known upper bounds for classical PAC learning.
I. INTRODUCTION



A. Motivation and Background 

A major focus of study in quantum computation is the power of quantum algorithms to extract information from 
a "black-box" oracle for an unknown Boolean function. Many of the most powerful ideas for both algorithmic results 
and lower bounds in quantum computing have emerged from this framework, which has been studied for more than 
a decade. 

The most frequently considered problem in this setting is to determine whether or not the black-box oracle (which is
typically assumed to belong to some particular a priori fixed class C of possible functions) has some specific property,
such as being identically 0, being exactly balanced between outputs 0 and 1 [?], or being invariant under an
XOR mask [22]. However, as described below, researchers have also studied several other problems in which the goal
is to obtain more than just one bit of information about the target black-box function:

Quantum exact learning from membership queries: Servedio and Gortler [20] initiated a systematic study of
the quantum black-box query complexity required to exactly learn any unknown function c from a class C of
Boolean functions. This is a natural quantum analogue of the standard classical model of exact learning from
membership queries which was introduced in computational learning theory by Angluin [2]. This quantum exact
learning model was also studied by Hunziker et al. [17] and by Ambainis et al. [1], who gave a general upper
bound on the quantum query complexity of learning any class C.
PAC learning from quantum examples: In another line of related research, Bshouty and Jackson [9] introduced
a natural quantum analogue of Valiant's well-known Probably Approximately Correct (PAC) model of Boolean
function learning [23], which is widely studied in computational learning theory. Servedio and Gortler [20]
subsequently gave an $\Omega(d/n)$ lower bound on the number of quantum examples required for any PAC learning
algorithm for any class C of Boolean functions over $\{0,1\}^n$ which has Vapnik-Chervonenkis dimension d.



B. Our Results

In this paper we study three natural problems of quantum learning: (i) exact learning from quantum membership
queries as described above; (ii) learning a partition of a class of functions from quantum membership queries (this is
an intermediate problem between the quantum exact learning problem and the well-studied problem of obtaining a
single bit of information about the target function); and (iii) quantum PAC learning as described above. For each of
these problems we give new bounds on the number of quantum queries or examples that are required for learning.

For the quantum exact learning model, Hunziker et al. conjectured that for any class C of Boolean functions,
the number of quantum black-box queries that are required to exactly learn an unknown function from C is
$O\left(\frac{\log |C|}{\sqrt{\hat{\gamma}^C}}\right)$, where $\hat{\gamma}^C$ (defined in Section III A) is a
combinatorial parameter of the class C. We give a new quantum exact learning algorithm based on a multi-target
Grover search on a prescribed subset of the inputs, and show that the query complexity of our algorithm is
$O\left(\frac{\log |C| \log\log |C|}{\sqrt{\hat{\gamma}^C}}\right)$; this resolves the conjecture of Hunziker et al. up to
a $\log\log |C|$ factor. Our new bound is incomparable with the upper bound of Ambainis et al. [1], but as we show
it improves on this bound for a wide range of parameter settings. We also show that for every class C of Boolean
functions, the query complexity of our generic algorithm is guaranteed to be at most a (roughly) quadratic factor
worse than the query complexity of the best quantum algorithm for learning C (which may be tailored for the specific
class C).

For our second problem, we study a more general problem which is intermediate between learning the black-box
function exactly and computing a single Boolean predicate of the unknown black-box function. This problem is the
following: given a partition of a class C into disjoint subsets $P_1, \ldots, P_k$, determine which piece the unknown
black-box function $c \in C$ belongs to. Ambainis et al. proposed the study of this problem as an interesting direction
for future work in [1]. Note that the problem of computing a single Boolean predicate of an unknown function
$c \in C$ corresponds to having a two-way partition, whereas the problem of exact learning corresponds to a partition
of C into |C| disjoint pieces.

We show that for any concept class C and any partition size $2 \le k \le |C|$, there is a partition of C into k pieces such
that the classical and quantum query complexities are polynomially related. On the other hand, we also show that
for a wide range of partition sizes k it is possible for the quantum and classical query complexities of learning a k-way
partition to have a superpolynomial separation. These results show that the structure of the partition plays a more
important role than the size in determining the relationship between quantum and classical complexity of learning.

Finally, for the quantum PAC learning model, we improve the $\Omega(\frac{d}{n})$ lower bound of [20] on the number of
quantum examples which are required to PAC learn any concept class of Vapnik-Chervonenkis dimension d over
$\{0,1\}^n$. Our new bound of $\Omega(\frac{1}{\epsilon} \log \frac{1}{\delta} + d + \frac{\sqrt{d}}{\epsilon})$ is not far
from the known lower bound of Ehrenfeucht et al. [11] of $\Omega(\frac{1}{\epsilon} \log \frac{1}{\delta} + \frac{d}{\epsilon})$
for classical PAC learning. Since the lower bound of [11] is known to be nearly optimal for classical PAC learning
algorithms (an upper bound of $O(\frac{1}{\epsilon} \log \frac{1}{\delta} + \frac{d}{\epsilon} \log \frac{1}{\epsilon})$ was
given by [?]), our new quantum lower bound is not far from being the best possible.

C. Organization

Section III gives our new quantum algorithm for exactly learning a black-box function. Section IV gives some
simple examples and poses a question about the relation between query complexity of quantum and classical exact
learning. Section V gives our results on the partition learning problem, and Section VI gives our new lower bound
on the sample complexity of quantum PAC learning. Section VII concludes with some additional open questions for
further work.
II. PRELIMINARIES

A. Learning Preliminaries 

A concept c over $\{0,1\}^n$ is a Boolean function $c : \{0,1\}^n \to \{0,1\}$. Equivalently we may view a concept as a subset
of $\{0,1\}^n$ defined by $\{x \in \{0,1\}^n : c(x) = 1\}$. A concept class $C = \cup_{n \ge 1} C_n$ is a set of concepts where $C_n$ consists of
those concepts in C whose domain is $\{0,1\}^n$. For ease of notation throughout the paper we will omit the subscript
in $C_n$ and simply write C to denote a collection of concepts over $\{0,1\}^n$. It will often be useful to think of C as a
$|C| \times 2^n$ binary matrix where rows correspond to concepts $c \in C$, columns correspond to inputs $x \in \{0,1\}^n$, and the
(i, j) entry of the matrix is the value of the i-th concept on the j-th input.

We say that a concept class C is 1-sensitive if it has the property that for each input x, at least half of all
concepts $c \in C$ have $c(x) = 0$ (i.e. each column of the matrix C is at most half ones). Given any C it is possible
to convert it to an equivalent 1-sensitive concept class by flipping the value obtained from any input x which has
$|\{c : c(x) = 1\}| > |\{c : c(x) = 0\}|$. This condition on x can simply be checked by enumerating all concepts c in
C, without making any queries. In general, we refer to the process of flipping the matrix entries which reside in
a particular subset of columns as performing a column flip. This notion of 1-sensitivity and a column flip was first
introduced by [1].

It is important to note that achieving the effect of a column flip in our algorithms involves creating and using
simulated oracles. In other words, a column flip affects not only the matrix corresponding to the set of candidate
concepts C but also the result of classical and quantum membership queries. Therefore, after a column flip on the
subset of inputs $\mathcal{K}$, the answer to a membership query to the target oracle at any input in $\mathcal{K}$ should be
inverted before it is returned to the algorithm. As remarked in [?], in both the classical and quantum learning models
this can be achieved via some additional circuitry which is not significant for our purposes, since we are only interested
in the query complexity.
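As a rough classical illustration of the column flip and the simulated oracle described above, the following sketch represents a concept class as a list of 0/1 tuples (one tuple per concept, one position per input). The function names `make_one_sensitive` and `simulated_oracle` are hypothetical and chosen only for this example; the quantum case would need extra circuitry that is not modelled here.

```python
def make_one_sensitive(concepts):
    """Column-flip a 0/1 concept matrix so that every column is at most half ones.

    Returns the flipped matrix and the set K of flipped input indices.
    No oracle queries are needed: only the known class is inspected.
    """
    n_rows = len(concepts)
    n_cols = len(concepts[0])
    flipped_cols = {
        x for x in range(n_cols)
        if 2 * sum(row[x] for row in concepts) > n_rows  # strictly more ones than zeros
    }
    flipped = [
        tuple(bit ^ (x in flipped_cols) for x, bit in enumerate(row))
        for row in concepts
    ]
    return flipped, flipped_cols


def simulated_oracle(oracle, flipped_cols):
    """Wrap a membership oracle so that answers on inputs in K are inverted."""
    return lambda x: oracle(x) ^ (x in flipped_cols)


concepts = [(1, 1, 0), (1, 0, 0), (1, 0, 1)]
flipped, K = make_one_sensitive(concepts)
target = concepts[2]
oracle = simulated_oracle(lambda x: target[x], K)
```

The wrapped oracle answers exactly as the flipped row of the target would, so an algorithm working on the flipped matrix stays consistent with its queries.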

B. Classical Learning Models 

The classical model of exact learning from membership queries was introduced by Angluin [2] and has since been
studied by several authors [?]. In this framework, a learning algorithm for C is given query access to a
black-box oracle $MQ_c$ for the unknown target concept $c \in C$, i.e. when the learner provides $x \in \{0,1\}^n$ to $MQ_c$ she
receives back the value c(x). A learning algorithm is said to be an exact learning algorithm for concept class C if the
following holds: for any $c \in C$, with probability at least 2/3 the learning algorithm outputs a Boolean circuit h which
is logically equivalent to c. (Note that a learning algorithm for C "knows" the class C but does not know the identity
of the target concept $c \in C$.) The query complexity of a learning algorithm is the number of queries that it makes to
$MQ_c$ before outputting h. We will be chiefly concerned in this paper with a quantum version of the exact learning
model, which we describe in Section III.

In the classical PAC (Probably Approximately Correct) learning model, which was introduced by Valiant [23] and
subsequently studied by many authors, the learning algorithm has access to a random example oracle $EX(c, \mathcal{D})$ where
$c \in C$ is the unknown target concept and $\mathcal{D}$ is an unknown probability distribution over $\{0,1\}^n$. At each invocation
the oracle $EX(c, \mathcal{D})$ (which takes no inputs) outputs a labeled example (x, c(x)) where $x \in \{0,1\}^n$ is drawn from the
distribution $\mathcal{D}$. An algorithm A is a PAC learning algorithm for concept class C if the following condition holds: given
any $\epsilon, \delta > 0$, for all $c \in C$ and all distributions $\mathcal{D}$ over $\{0,1\}^n$, if A is given $\epsilon, \delta$ and is given access to $EX(c, \mathcal{D})$ then
with probability at least $1 - \delta$ the output of A is a Boolean circuit $h : \{0,1\}^n \to \{0,1\}$ (called a hypothesis) which
satisfies $\Pr_{x \in \mathcal{D}}[h(x) \neq c(x)] \le \epsilon$. The (classical) sample complexity of A is the maximum number of calls to $EX(c, \mathcal{D})$
which it makes for any $c \in C$ and any distribution $\mathcal{D}$. In Section VI we will study a quantum version of the PAC
learning model.

III. EXACT LEARNING WITH QUANTUM MEMBERSHIP QUERIES 

Given any concept $c : \{0,1\}^n \to \{0,1\}$, the quantum membership oracle $QMQ_c$ is the transformation
which acts on the computational basis states by mapping $|x, b\rangle \mapsto |x, b \oplus c(x)\rangle$ where $x \in \{0,1\}^n$ and
$b \in \{0,1\}$. A quantum exact learning algorithm for a concept class C is a sequence of unitary transformations
$U_0, QMQ_c, U_1, QMQ_c, \ldots, QMQ_c, U_T$ where each $U_i$ is a fixed unitary transformation without any dependence on
c. The algorithm must satisfy the following property: for any target concept $c \in C$ which is used to instantiate the
QMQ queries, a measurement performed on the final state will with probability at least 2/3 yield a representation
of a (classical) Boolean circuit $h : \{0,1\}^n \to \{0,1\}$ such that h(x) = c(x) for all $x \in \{0,1\}^n$. The quantum query
complexity of the algorithm is T, the number of invocations of $QMQ_c$.
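To make the action of the quantum membership oracle concrete, the sketch below writes the map $|x, b\rangle \mapsto |x, b \oplus c(x)\rangle$ as a permutation of basis-state labels. The encoding of $|x, b\rangle$ as the integer 2x + b and the name `qmq_permutation` are assumptions made only for this illustration; the sketch tracks basis states, not amplitudes.

```python
def qmq_permutation(concept):
    """Return the basis-state permutation |x,b> -> |x, b XOR c(x)> as a list.

    `concept` is a tuple of 0/1 values indexed by the input x.
    Basis state |x,b> is encoded as the integer 2*x + b (an illustrative choice).
    """
    perm = []
    for x, value in enumerate(concept):
        for b in (0, 1):
            perm.append(2 * x + (b ^ value))
    return perm


perm = qmq_permutation((0, 1, 1, 0))
```

Because the map is a permutation of basis states and is its own inverse, it extends to a unitary transformation, which is what allows it to be interleaved with the fixed unitaries $U_i$.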

Note that a quantum membership oracle $QMQ_c$ is identical to the notion of "a quantum black-box oracle for c"
which has been widely studied in e.g. [?] and many other works. Most of this work, however, focuses on the
quantum query complexity of computing a single bit of information about the unknown oracle, e.g. the OR of all its
output values [?] or the parity of all its output values [12]. The quantum exact learning problem which we consider
in this section was proposed in [20] and later studied in [1] (where it is called the "oracle identification problem") and
in [17].

Throughout the paper we write R(C) to denote the minimum query complexity of any classical (randomized) exact
learning algorithm for concept class C. We write Q(C) to denote the minimum query complexity of any quantum
exact learning algorithm for C. We write N to denote $2^n$, the number of elements in the domain of each $c \in C$.

In Section III A we briefly recap known bounds on the query complexity of quantum and classical exact learning
algorithms. In Section III B we give our new quantum learning algorithm, prove correctness, and analyze its query
complexity.

A. Known bounds on query complexity for exact learning 

We begin by defining a combinatorial parameter $\hat{\gamma}^C$ of a concept class C which plays an important role in bounds
on query complexity of exact learning algorithms.

Definition III.1 Let C be a concept class over $\{0,1\}^n$. We define

$$\gamma_a^{C'} = \min_{b \in \{0,1\}} \frac{|\{c \in C' : c(a) = b\}|}{|C'|}, \quad \text{where } a \in \{0,1\}^n,\ C' \subseteq C,$$

$$\gamma^{C'} = \max_{a \in \{0,1\}^n} \gamma_a^{C'}, \quad \text{where } C' \subseteq C,$$

$$\hat{\gamma}^C = \min_{C' \subseteq C,\ |C'| \ge 2} \gamma^{C'}.$$

If $C' \subseteq C$ is the set of possible remaining target concepts, then $\gamma^{C'}$ is the maximum fraction of C' which a (classical)
learning algorithm can be sure of eliminating with a single query. Thus, intuitively, the smaller $\hat{\gamma}^C$ is the more
membership queries should be required to learn C.
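The three quantities of Definition III.1 can be computed by brute force for very small classes, which may help in checking the definition against examples. The sketch below assumes concepts are given as 0/1 tuples indexed by input; the function names are illustrative, and the enumeration over all subsets is exponential in |C|, so this is only meant for toy instances.

```python
from fractions import Fraction
from itertools import combinations


def gamma_at(sub, a):
    """gamma_a^{C'}: the fraction of C' taking the minority value at input a."""
    ones = sum(c[a] for c in sub)
    return Fraction(min(ones, len(sub) - ones), len(sub))


def gamma(sub):
    """gamma^{C'}: the best guaranteed elimination fraction over all inputs."""
    return max(gamma_at(sub, a) for a in range(len(sub[0])))


def gamma_hat(concepts):
    """hat-gamma^C: minimum of gamma^{C'} over all subsets C' with |C'| >= 2."""
    return min(
        gamma(list(sub))
        for size in range(2, len(concepts) + 1)
        for sub in combinations(concepts, size)
    )


# The "singleton" class over four inputs: concept i is 1 only on input i.
singletons = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
```

For this singleton class every query can only rule out one concept in the worst case, so the parameter equals 1/|C|, matching the intuition that small values of $\hat{\gamma}^C$ force many classical queries.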

The following lower and upper bounds on the query complexity of classical exact learning were established in [?]:

Theorem III.2 For any concept class C we have $R(C) = \Omega(\frac{1}{\hat{\gamma}^C})$ and $R(C) = \Omega(\log |C|)$.

Theorem III.3 There is a classical exact learning algorithm which learns any concept class C using
$O(\frac{\log |C|}{\hat{\gamma}^C})$ many queries, so consequently $R(C) = O(\frac{\log |C|}{\hat{\gamma}^C})$.

A quantum analogue of this classical lower bound was obtained in [20]:

Theorem III.4 For any concept class C over $\{0,1\}^n$ we have $Q(C) = \Omega(\frac{1}{\sqrt{\hat{\gamma}^C}})$ and
$Q(C) = \Omega(\frac{\log |C|}{n})$.

Given these results it is natural to seek a quantum analogue of the classical $O(\frac{\log |C|}{\hat{\gamma}^C})$ upper bound.
Hunziker et al. [17] made the following conjecture:

Conjecture III.5 There is a quantum exact learning algorithm which learns any concept class C using
$O(\frac{\log |C|}{\sqrt{\hat{\gamma}^C}})$ quantum membership queries.

In Section III B we prove this conjecture up to a $\log\log |C|$ factor.

Hunziker et al. [17] also conjectured that there is a quantum exact learning algorithm which learns any concept
class C using $O(\sqrt{|C|})$ queries. This was established by Ambainis et al. [1], who also proved the following result:

Theorem III.6 There is a quantum exact learning algorithm which learns any concept class C with |C| > N using
$O(\sqrt{N \log |C| \log N}\, \log\log |C|)$ many queries.
B. A New Quantum Exact Learning Algorithm

We start with a simple yet useful observation:

Lemma III.7 For any concept class C, there exists an $x \in \{0,1\}^n$ for which at least a $\hat{\gamma}^C$ fraction of concepts $c \in C$
satisfy c(x) = 1. More generally, for every subset $C' \subseteq C$ with $|C'| \ge 2$, there exists an input x at which the fraction
of concepts in C' yielding 1 is at least $\gamma^{C'}$ (which is at least as large as $\hat{\gamma}^C$).

Proof: It is sufficient to prove the result in the latter general form. Consider any subset $C' \subseteq C$ with $|C'| \ge 2$. By
Definition III.1 we know:

• $\gamma^{C'} \ge \hat{\gamma}^C$.

• At any input $a \in \{0,1\}^n$, the fraction of concepts in C' yielding 1 has to be at least $\gamma_a^{C'}$.

Now consider the input a which satisfies $\gamma_a^{C'} = \gamma^{C'}$: the fraction of concepts in C' yielding 1 at input a should therefore
be at least $\gamma^{C'}$. Thus taking x = a gives the intended result. ■

The quantity $\gamma^C$ can be bounded as follows:

Lemma III.8 For any concept class C with $|C| \ge 2$, $\frac{1}{N+1} \le \gamma^C \le \frac{1}{2}$. This also implies
$\frac{1}{N+1} \le \hat{\gamma}^C \le \frac{1}{2}$ by Definition III.1.

Proof: $\gamma^C \le \frac{1}{2}$ is clear from Definition III.1. To prove the other direction we may assume that $\gamma^C < \frac{1}{N}$, since
otherwise the result is obviously true. Therefore |C| > N must hold (two distinct concepts differ at some input, so
$\gamma^C \ge \frac{1}{|C|}$). Observe that at each input x one of the following must hold:

• The fraction of concepts in C yielding 0 at x is at least $1 - \gamma^C$ and thus the fraction of concepts in C yielding
1 at x is at most $\gamma^C$.

• The fraction of concepts in C yielding 1 at x is at least $1 - \gamma^C$ and thus the fraction of concepts in C yielding
0 at x is at most $\gamma^C$.

Hence C can contain at most $\gamma^C |C| N$ concepts which are not identically equal to the concept $c_{maj}$ defined as follows:

$$c_{maj}(x) = \begin{cases} 0, & \text{if at least half of the concepts in } C \text{ yield } 0 \text{ at } x; \\ 1, & \text{otherwise.} \end{cases}$$

Therefore C must be comprised of these at most $\gamma^C |C| N$ concepts and possibly $c_{maj}$. Since $|C| \ge N + 1$, we obtain:

$$\gamma^C |C| N + 1 \ge |C| \;\Longrightarrow\; \gamma^C \ge \frac{|C| - 1}{|C| N} \ge \frac{N}{(N+1)N} = \frac{1}{N+1}.$$
■

Definition III.9 A subset of inputs $\mathcal{I} \subseteq \{0,1\}^n$ is said to satisfy the semi-rich row condition for C if at least half
the concepts in C have the property that they yield 1 for at least a $\hat{\gamma}^C$ fraction of the inputs x in $\mathcal{I}$.

The phrase "semi-rich row condition" is used because viewing C as a matrix, at least half the rows of C are "rich"
in 1s (have at least a $\hat{\gamma}^C$ fraction of 1s) within the columns indexed by inputs in $\mathcal{I}$. A simple greedy approach can be
used to construct a set of inputs which satisfies the semi-rich row condition for C:

Lemma III.10 Let C be any concept class with $|C| \ge 2$. Then Algorithm 1 outputs a set of inputs $\mathcal{I}$ with
$|\mathcal{I}| \le \frac{1}{\hat{\gamma}^C}$ which satisfies the semi-rich row condition for C.

Algorithm 1: Constructing a set of inputs which satisfies the semi-rich row condition.

$S \leftarrow \emptyset$, $\mathcal{I} \leftarrow \emptyset$.
repeat
    Perform a column flip on $C \setminus S$ to make $C \setminus S$ be 1-sensitive.
    $a_{max} \leftarrow$ the input in $\{0,1\}^n \setminus \mathcal{I}$ at which the highest fraction of concepts in $C \setminus S$ yield 1.
    $\mathcal{I} \leftarrow \mathcal{I} \cup \{a_{max}\}$.
    $S \leftarrow S \cup$ the set of concepts in the original matrix C that yield 1 at input $a_{max}$.
until $|S| \ge |C|/2$.
Output $\mathcal{I}$.

Proof: Let $r_j |C|$ be the number of concepts in $C \setminus S$ after the j-th execution of the repeat loop in Algorithm 1, so
$r_0 = 1$. Using the first result of Lemma III.7 we obtain $r_1 \le 1 - \hat{\gamma}^C$. Now invoking Lemma III.7 once again but this
time in its general form, we obtain $r_2 \le (1 - \hat{\gamma}^C)^2$; note that after the second iteration of the loop, each concept in S
will yield 1 on at least half of the elements in $\mathcal{I}$. Let j' equal $\lfloor 1/\hat{\gamma}^C \rfloor$. If Algorithm 1 proceeds for j' iterations through
the loop, then it must be the case that $r_{j'} \le (1 - \hat{\gamma}^C)^{j'}$ and that each concept in S yields 1 on at least a $\hat{\gamma}^C$ fraction
of the elements in $\mathcal{I}$. It is easy to verify that $(1 - x)^{\lfloor 1/x \rfloor} < 1/2$ for $0 < x \le \frac{1}{2}$. We thus have that $r_{j'} |C| < |C|/2$,
and consequently $|S| > |C|/2$, so $\mathcal{I}$ satisfies the semi-rich row condition for C and the algorithm will terminate before
starting the $(\lfloor 1/\hat{\gamma}^C \rfloor + 1)$-th iteration.

In the case $\hat{\gamma}^C < \frac{1}{N}$, the set of all N inputs will satisfy the semi-rich row condition for C, since any concept
which does not yield 0 for all inputs actually yields 1 for at least a $\hat{\gamma}^C$ fraction of all inputs. Therefore in this case the
algorithm will terminate successfully with an output satisfying $|\mathcal{I}| \le N < \frac{1}{\hat{\gamma}^C}$. Otherwise, since $\hat{\gamma}^C \ge \frac{1}{N}$ we have that $j' \le N$.
This means the algorithm never runs out of inputs to add (i.e. $\{0,1\}^n \setminus \mathcal{I}$ is nonempty at every iteration). ■
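The greedy construction of Algorithm 1 can be sketched classically as follows. This is an illustrative reading of the listing, not the authors' code: the column flip on $C \setminus S$ is modelled by scoring each input with the minority count among the remaining concepts, and the function name `semi_rich_inputs` is hypothetical. The sketch assumes the class contains at least two distinct concepts, given as 0/1 tuples indexed by input.

```python
def semi_rich_inputs(concepts):
    """Greedy sketch of Algorithm 1; returns the chosen inputs in order.

    Assumes `concepts` holds at least two distinct 0/1 tuples of equal length.
    """
    n_inputs = len(concepts[0])
    chosen = []        # the set I, in the order inputs are added
    covered = set()    # indices of the concepts currently in S
    while 2 * len(covered) < len(concepts):
        remaining = [i for i in range(len(concepts)) if i not in covered]
        candidates = [a for a in range(n_inputs) if a not in chosen]

        def ones_after_flip(a):
            # After making C \ S 1-sensitive, column a holds the minority count.
            ones = sum(concepts[i][a] for i in remaining)
            return min(ones, len(remaining) - ones)

        a_max = max(candidates, key=ones_after_flip)
        chosen.append(a_max)
        # S grows by every concept of the original matrix that yields 1 at a_max.
        covered |= {i for i, c in enumerate(concepts) if c[a_max] == 1}
    return chosen


singletons = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
inputs = semi_rich_inputs(singletons)
```

On the singleton class the sketch stops after two inputs, well within the $1/\hat{\gamma}^C = 4$ bound of Lemma III.10, and half of the concepts yield 1 somewhere on the chosen inputs.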

Our quantum learning algorithm is given in Algorithm 2. Throughout the algorithm the set $S \subseteq C$ should be
viewed as the set of possible target concepts that have not yet been eliminated; the algorithm halts when |S| = 1.
The high-level idea of the algorithm is that in every repetition of the outer loop the size of S is multiplied by a factor
which is at most $\frac{1}{2}$, so at most $\log |C|$ repetitions of the outer loop are performed.



Algorithm 2: A Quantum Exact Learning Algorithm.

$S \leftarrow C$.
repeat
    Perform a column flip on S to make S 1-sensitive. Let $\mathcal{K}$ be the set of inputs at which the output is flipped
    during this procedure.
    $\mathcal{I} \leftarrow$ the output of Algorithm 1 invoked on the set of concepts S.
    Counter $\leftarrow$ 0, Success $\leftarrow$ False.
    repeat
        Perform the multi-target subset Grover search on $\mathcal{I}$ using $O(\sqrt{|\mathcal{I}|})$ queries [?].
        $a \leftarrow$ result of the Grover search.
        if a classical query of the oracle at a yields 1 then
            $S \leftarrow$ {the concepts in S that yield 1 at a}, Success $\leftarrow$ True.
        end if
        Counter $\leftarrow$ Counter + 1.
    until Success or (Counter $\ge \log(3 \log |C|)$)
    if not Success then
        $S \leftarrow$ the set of concepts that yield 0 for all of the inputs in $\mathcal{I}$.
    end if
    Flip the outputs of concepts in S for all elements in $\mathcal{K}$ to reverse the earlier column flip (thus restoring all
    concepts in S to their original behavior on all inputs).
until |S| = 1.
Output a representation of a circuit which computes the sole concept c in S.



Theorem III.11 Let C be any concept class with $|C| \ge 2$. Algorithm 2 is a quantum exact learning algorithm for C
which performs $O\left(\frac{\log |C| \log\log |C|}{\sqrt{\hat{\gamma}^C}}\right)$ quantum membership queries.

Proof: Consider a particular iteration of the outer Repeat-Until loop. The set S is 1-sensitive by virtue of the first
step (the column flip). By Lemma III.10, in the second step of this iteration, $\mathcal{I}$ becomes a set of at most $\frac{1}{\hat{\gamma}^S}$ many
inputs which satisfies the semi-rich row condition for S; note that $\hat{\gamma}^S \ge \hat{\gamma}^C$ because $S \subseteq C$. Consequently, each
execution of the Grover search within the inner Repeat-Until loop uses $O(\sqrt{|\mathcal{I}|})$ (which is also
$O(\frac{1}{\sqrt{\hat{\gamma}^C}})$) many queries. Since the inner loop repeats at most $\log(3 \log |C|)$ many times, if we can show that
each iteration of the outer loop does indeed with high probability (i) cause the size of S to be multiplied by a factor
which is at most $\frac{1}{2}$, and (ii) maintain the property that the target concept is contained in S, then the theorem
will be proved.

As shown in [?], the multi-target Grover search algorithm over a space of $|\mathcal{I}|$ many inputs using $O(\sqrt{|\mathcal{I}|})$ many
queries has the property that if there is any input $a \in \mathcal{I}$ on which the target function yields 1, then the search will
output such an a with probability at least $\frac{1}{2}$. Since the inner loop repeats $\log(3 \log |C|)$ many times, we thus have
that if there is any input $a \in \mathcal{I}$ on which the target concept yields 1, then with probability at least $1 - \frac{1}{3 \log |C|}$ one of
the $\log(3 \log |C|)$ many iterations of the inner loop will yield such an a and the "Success" variable will be set to True.
Since the set S is 1-sensitive, when we eliminate from S all the concepts which yield 0 at a we will multiply the size
of S by at most $\frac{1}{2}$ as desired in this case (and clearly we will not eliminate the target concept from S). On the other
hand, if the set $\mathcal{I}$ contains no input a on which the target concept yields 1, then after $\log(3 \log |C|)$ iterations of the
inner loop we will exit with Success set to False, and the concepts that yield 1 on any input in $\mathcal{I}$ will be removed
from S. This will clearly not cause the target to be removed from S. Moreover, because $|\mathcal{I}| \le \frac{1}{\hat{\gamma}^S}$, any concept for
which even a single input in $\mathcal{I}$ yields 1 has the property that at least a $\hat{\gamma}^S$ fraction of inputs in $\mathcal{I}$ yield 1. Since $\mathcal{I}$
satisfies the semi-rich row condition for S, this means that we have eliminated at least half the concepts in S. Thus,
the algorithm will succeed with probability at least $\left(1 - \frac{1}{3 \log |C|}\right)^{\log |C|}$, which is larger than 2/3, and the theorem is
proved. ■
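The success-probability arithmetic at the end of the proof can be checked numerically. The sketch below assumes base-2 logarithms and treats the number of inner repetitions as the real number $\log(3 \log |C|)$ rather than rounding it up; both are simplifying assumptions of this illustration, and the function name is hypothetical.

```python
import math


def success_lower_bound(class_size):
    """Lower bound on the overall success probability of Algorithm 2 (sketch)."""
    rounds = math.log2(class_size)        # at most log|C| outer iterations
    inner = math.log2(3 * rounds)         # inner-loop repetitions
    per_round_failure = 0.5 ** inner      # each Grover run fails w.p. <= 1/2
    return (1 - per_round_failure) ** rounds


bounds = {k: success_lower_bound(2 ** k) for k in (2, 10, 20)}
```

With these conventions the per-round failure probability is exactly $1/(3 \log |C|)$, and the product over all rounds stays above 2/3, approaching $e^{-1/3} \approx 0.717$ as |C| grows.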

Recall that Q(C) denotes the optimal query complexity over all quantum exact learning algorithms for concept class
C. We can show that the query complexity of Algorithm 2 is never much worse than the optimal query complexity
Q(C):

Corollary III.12 For any concept class C, Algorithm 2 uses $O(n Q(C)^2 \log\log |C|)$ queries.

Proof: This follows directly from Theorem III.11 and the bound $Q(C) = \Omega\left(\frac{1}{\sqrt{\hat{\gamma}^C}} + \frac{\log |C|}{n}\right)$ of Theorem III.4. ■

Since $|C| \le 2^{2^n}$, the bound $O(n Q(C)^2 \log\log |C|)$ is always $O(n^2 Q(C)^2)$, and thus the query complexity of
Algorithm 2 is always polynomially related to the query complexity of the optimal algorithm for any concept class
C.
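For readers who want the intermediate steps, the chain of inequalities behind Corollary III.12 and the remark that follows it can be written out as below. This is only an expansion of the argument already stated, using the two lower bounds of Theorem III.4 and the upper bound of Theorem III.11.

```latex
\begin{align*}
\frac{\log|C|\,\log\log|C|}{\sqrt{\hat{\gamma}^C}}
  &= \underbrace{\log|C|}_{O(n\,Q(C))} \cdot
     \underbrace{\frac{1}{\sqrt{\hat{\gamma}^C}}}_{O(Q(C))} \cdot \log\log|C|
   = O\!\left(n\,Q(C)^2 \log\log|C|\right), \\
\log\log|C| &\le \log\log 2^{2^n} = n
  \;\Longrightarrow\;
  O\!\left(n\,Q(C)^2 \log\log|C|\right) = O\!\left(n^2\,Q(C)^2\right).
\end{align*}
```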



C. Discussion 



Algorithm 2 can be viewed as a variant of the algorithm of [1] which learns any concept class C from
$O(\sqrt{N \log |C| \log N}\, \log\log |C|)$ quantum membership queries. This algorithm repeatedly performs Grover search over
the set of all inputs, with the goal each time of eliminating at least half of the remaining target concepts. Instead, our
approach is to perform each Grover search only over sets which satisfy the semi-rich row condition for the remaining
set of possible target concepts. By doing this, we are able to obtain an upper bound on query complexity in terms of
$\hat{\gamma}^C$ for every such iteration.

We observe that our new bound of $O\left(\frac{\log |C| \log\log |C|}{\sqrt{\hat{\gamma}^C}}\right)$ is stronger than the previously obtained upper bound of
$O(\sqrt{N \log |C| \log N}\, \log\log |C|)$ from [1] as long as $\frac{\log |C|}{\hat{\gamma}^C} = o(N \log N)$. Thus, for any concept class C for which the
$O(\frac{\log |C|}{\hat{\gamma}^C})$ upper bound of Theorem III.3 on classical membership query algorithms is nontrivial (i.e. is less than N),
our results give an improvement.

We note that independently Iwama et al. [18] have recently given a new algorithm for quantum exact learning
that uses ideas similar to the construction of Algorithm 1; however the analysis is different and their results are
incomparable to ours (their bounds depend only on the number of concepts in C and not on the combinatorial
parameter $\hat{\gamma}^C$). The main focus of [18] is on obtaining robust learning algorithms that can learn successfully using
noisy oracle queries.



IV. RELATIONS BETWEEN QUERY COMPLEXITY OF QUANTUM AND CLASSICAL EXACT LEARNING

As noted in [20], combining Theorems III.3 and III.4 yields the following:

Corollary IV.1 For any concept class C, we have $Q(C) \le R(C) = O(n Q(C)^3)$.

Can tighter bounds relating R(C) and Q(C) be given which hold for all concept classes C? While we have not
been able to answer this question, here we make some simple observations and pose a question which we hope will
stimulate further work.

We first observe that the factor n is required in the bound $R(C) = O(n Q(C)^3)$:
Lemma IV.2 For any positive integer d there exists a concept class C over $\{0,1\}^n$ with $R(C) = \omega(1)$ which has
$R(C) = \Omega(Q(C)^d)$.

Proof: We assume d > 1. Recall that in the Bernstein-Vazirani problem, the target concept is an unknown parity
function over some subset of the n Boolean variables $x_1, \ldots, x_n$; Bernstein and Vazirani showed [?] that for this
concept class we have R(C) = n whereas Q(C) = 1. We thus consider a concept class in which each concept c contains
$n^{1/d}$ copies of the Bernstein-Vazirani problem (each instance of the problem is over $n^{(d-1)/d}$ variables) as follows: we
view n-bit strings a, x as

$$a = (a_{1,1}, a_{1,2}, \ldots, a_{1,n^{(d-1)/d}},\ a_{2,1}, a_{2,2}, \ldots, a_{2,n^{(d-1)/d}},\ \ldots,\ a_{n^{1/d},1}, \ldots, a_{n^{1/d},n^{(d-1)/d}}),$$

$$x = (x_{1,1}, x_{1,2}, \ldots, x_{1,n^{(d-1)/d}},\ x_{2,1}, x_{2,2}, \ldots, x_{2,n^{(d-1)/d}},\ \ldots,\ x_{n^{1/d},1}, \ldots, x_{n^{1/d},n^{(d-1)/d}}).$$

The class C consists of the set of all $2^n$ concepts:

$$f_a(x) = \bigvee_{i=1}^{n^{1/d}} \left( (a_{i,1}, a_{i,2}, \ldots, a_{i,n^{(d-1)/d}}) \cdot (x_{i,1}, x_{i,2}, \ldots, x_{i,n^{(d-1)/d}}) \bmod 2 \right),$$

i.e. $f_a(x)$ equals 1 if any of the $n^{1/d}$ parities corresponding to the substrings $a_{i,\cdot}$ take value 1 on the corresponding
substring of x.

It is easy to see that n queries suffice for a classical algorithm, and by Theorem III.2 we have $R(C) = \Omega(\log |C|)$,
so $R(C) = \Theta(n)$. On the other hand, it is also easy to see that $Q(C) = O(n^{1/d})$ since a quantum algorithm can learn by
making $n^{1/d}$ successive runs of the Bernstein-Vazirani algorithm.

Finally, if d = 1 then as shown in [?] the set C of all $2^{2^n}$ concepts over $\{0,1\}^n$ has $Q(C) = \Theta(2^n)$ and $R(C) = \Theta(2^n)$. ■
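The one-query behaviour of the Bernstein-Vazirani algorithm used in this proof can be illustrated with a small state-vector simulation for a single parity instance. This is a classical simulation under stated assumptions: the oracle is applied in phase form (the standard phase-kickback view of a membership query), the classical loop over all x stands in for a single quantum query, and the helper names are hypothetical.

```python
def parity_oracle(a):
    """Membership oracle for the parity concept x -> (a . x) mod 2."""
    return lambda x: bin(a & x).count("1") % 2


def bernstein_vazirani(oracle, n):
    """Simulate one Bernstein-Vazirani run on n bits; returns the hidden mask."""
    size = 1 << n
    norm = size ** -0.5
    # One phase query maps the uniform superposition to
    # sum_x (-1)^{c(x)} |x> / sqrt(2^n); classically we must loop over x.
    amps = [norm * (-1) ** oracle(x) for x in range(size)]
    # Apply a Hadamard gate to every qubit.
    out = [
        norm * sum(amp * (-1) ** bin(x & z).count("1") for x, amp in enumerate(amps))
        for z in range(size)
    ]
    # All amplitude sits on the hidden mask, so a measurement returns it.
    return max(range(size), key=lambda z: abs(out[z]))


recovered = bernstein_vazirani(parity_oracle(0b101), 3)
```

In the concept class of the lemma, one such run would be made per block of $n^{(d-1)/d}$ variables, which is where the $n^{1/d}$ quantum query count comes from.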

The bound $R(C) = O(n Q(C)^3)$ implies that the gap of Lemma IV.2 can only be achieved for concept classes C
which have R(C) small. However, it is easy to exhibit concept classes which have a factor n difference between R(C)
and Q(C) for a wide range of values of R(C):

Lemma IV.3 For any k such that $n - k = \Theta(n)$, there is a concept class C with $R(C) = \Theta(n 2^k)$ and $Q(C) = \Theta(2^k)$.

Proof: The concept class C is defined as follows. A concept $c \in C$ corresponds to $(a^0, \ldots, a^{2^k - 1})$, where each $a^i$ is
an (n - k)-bit string. The concept c maps input $x \in \{0,1\}^n$ to $(a^i \cdot y) \bmod 2$, where i is the number between 0 and
$2^k - 1$ whose binary representation is the first k bits of x and y is the (n - k)-bit suffix of x. Since each concept in C
is defined uniquely by $2^k$ many (n - k)-bit strings $a^0, \ldots, a^{2^k - 1}$, there are $2^{2^k (n-k)}$ concepts in C.

Theorem III.2 yields $R(C) = \Omega(2^k (n - k))$. It is easy to see that in fact $R(C) = \Theta(2^k (n - k))$: for each of the $2^k$
parities which one must learn (corresponding to the $2^k$ possible prefixes of an input), one can learn the (n - k)-bit
parity with n - k classical queries.

It is also easy to see that by running the Bernstein-Vazirani algorithm $2^k$ times (once for each different k-bit prefix),
a quantum algorithm can learn an unknown concept from C exactly using $2^k$ queries, and thus $Q(C) = O(2^k)$. The
$Q(C) = \Omega(\frac{\log |C|}{n})$ lower bound of Theorem III.4 gives us $Q(C) = \Omega(\frac{n-k}{n} \cdot 2^k) = \Omega(2^k)$, and the lemma is proved. ■
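The classical upper bound in this proof, $2^k (n - k)$ queries, can be sketched directly: for each k-bit prefix, querying the n - k unit-vector suffixes reads off the hidden parity mask bit by bit. Inputs are encoded here as n-bit integers with the prefix in the high-order bits; that encoding and the function names are assumptions of this sketch.

```python
def make_concept(strings, k, n):
    """Concept for masks a^0..a^{2^k-1}; strings[i] is a^i as an (n-k)-bit int."""
    m = n - k

    def c(x):
        i, y = x >> m, x & ((1 << m) - 1)   # k-bit prefix and (n-k)-bit suffix
        return bin(strings[i] & y).count("1") % 2

    return c


def learn_classically(oracle, k, n):
    """Recover every mask with exactly 2^k * (n-k) membership queries."""
    m = n - k
    queries = 0
    learned = []
    for i in range(1 << k):
        mask = 0
        for j in range(m):
            queries += 1
            if oracle((i << m) | (1 << j)):  # suffix is the j-th unit vector
                mask |= 1 << j
        learned.append(mask)
    return learned, queries


secret = [0b101, 0b000, 0b111, 0b010]
learned, queries = learn_classically(make_concept(secret, k=2, n=5), k=2, n=5)
```

A quantum learner would replace the inner loop by one Bernstein-Vazirani run per prefix, which accounts for the factor n - k gap between the two query counts.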

Based on these observations, we pose the following question: 
Question IV.4 Does every concept class C satisfy $R(C) = O(n Q(C) + Q(C)^2)$?

Note that the example in Lemma IV.3 and the concept class of Grover search [?], $C = \{f_i,\ 0 \le i < N : f_i(x) = \delta_{i,x}\}$,
saturate this upper bound.

V. ON LEARNING A PARTITION OF A CONCEPT CLASS 

Definition V.1 Let C be a concept class over $\{0,1\}^n$. A partition $\mathcal{P}$ of C is a collection of nonempty disjoint subsets
$P_1, \ldots, P_k$ whose union yields C.

In this section we study a different problem, mentioned by Ambainis et al. [1], that is more relaxed than exact
learning: given a partition $\mathcal{P}$ of C and a black-box (quantum or classical) oracle for an unknown target concept c in
C, what is the query complexity of identifying the set $P_i$ in $\mathcal{P}$ which contains c? It is easy to see that both the exact
learning problem (in which $|\mathcal{P}| = |C|$) and the problem of computing some binary property of c (for which $|\mathcal{P}| = 2$)
are special cases of this more general problem. One can view these problems in the following way: for the exact
learning problem the algorithm must obtain all $\log |C|$ bits of information about the target concept, whereas for the
problem of computing a property of c the algorithm must obtain a single bit of information. In a general instance of
the partition problem, the algorithm must obtain $\log |\mathcal{P}|$ bits of information about the target concept.

Given a concept class C and a partition $\mathcal{P}$ of C, we will write $R_{\mathcal{P}}(C)$ to denote the optimal query complexity of
any classical (randomized) algorithm for the partition problem which outputs the correct answer with probability at
least 2/3 for any target concept c. We similarly write $Q_{\mathcal{P}}(C)$ to denote the optimal complexity of any quantum query
algorithm with the same success criterion.

As noted earlier, for the case $|\mathcal{P}| = |C|$ we know from Corollary IV.1 that the quantities $R_{\mathcal{P}}(C)$ and $Q_{\mathcal{P}}(C)$ are
polynomially related for any concept class $C$, since $R_{\mathcal{P}}(C) = O(nQ_{\mathcal{P}}(C)^3)$. At the other extreme, if $|\mathcal{P}| = 2$ then
concept classes are known for which $R_{\mathcal{P}}(C)$ and $Q_{\mathcal{P}}(C)$ are polynomially related (see e.g. 0), and concept classes
are also known for which there is an exponential gap [22]. It is thus natural to investigate the relationship between
the size of $\mathcal{P}$ and the existence of a polynomial relationship between $R_{\mathcal{P}}(C)$ and $Q_{\mathcal{P}}(C)$.

In this section, we show that the number of sets in $\mathcal{P}$ alone (viewed as a function of $|C|$) often does not provide
sufficient information to determine whether $R_{\mathcal{P}}(C)$ and $Q_{\mathcal{P}}(C)$ are polynomially related. More precisely, in Section V.A
we show that for any concept class $C$ over $\{0,1\}^n$ and any value $2 \le k \le |C|$, there is a partition $\mathcal{P}$ of $C$ with $|\mathcal{P}| = k$
for which we have $R_{\mathcal{P}}(C) = O(nQ_{\mathcal{P}}(C)^3)$. On the other hand, in Section V.B we show that for a wide range of values
of $|\mathcal{P}|$ (again as a function of $|C|$), there are concept classes which have a superpolynomial separation between $R_{\mathcal{P}}(C)$
and $Q_{\mathcal{P}}(C)$. Thus, our results concretely illustrate that the structure of the partition (rather than the number of
sets in the partition) plays an important role in determining whether the quantum and classical query complexities
are polynomially related.



A. Partition Problems for which Quantum and Classical Complexity are Polynomially Related 

The following simple lemma extends the cardinality-based lower bounds of Theorem III.2 and Theorem III.4 for
exact learning to the problem of learning a partition:

Lemma V.2 For any partition $\mathcal{P}$ of any concept class $C$ over $\{0,1\}^n$, we have $R_{\mathcal{P}}(C) = \Omega(\log|\mathcal{P}|)$ and
$Q_{\mathcal{P}}(C) = \Omega(\frac{\log|\mathcal{P}|}{n})$.

Proof: Let $C' \subseteq C$ be a concept class formed by taking any single element from each subset in the partition $\mathcal{P}$.
Learning $\mathcal{P}$ requires at least as many queries as exact learning the concept class $C'$, and so the result follows from
Theorem III.2 and Theorem III.4. ■

To obtain a partition analogue of the other lower bounds of Theorems III.2 and III.4, we define the following
combinatorial parameter, which is an analogue of $\hat{\gamma}^C$:

Definition V.3 Let $\mathcal{S}$ be the set of all subsets $C' \subseteq C$, $|C'| \ge 2$, which have the property that any subset $C'' \subseteq C'$
with $|C''| \ge \frac{3}{4}|C'|$ must intersect at least two subsets in $\mathcal{P}$. We define $\hat{\gamma}^C_{\mathcal{P}}$ to be $\hat{\gamma}^C_{\mathcal{P}} := \min_{C' \in \mathcal{S}} \gamma^{C'}$.

Thus each subset $C'$ in $\mathcal{S}$ has the property that the partition induced on $C'$ by $\mathcal{P}$ contains no subset of size as large
as $\frac{3}{4}|C'|$.
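
As a concrete illustration of Definition V.3, the following Python sketch computes $\hat{\gamma}^C_{\mathcal{P}}$ by brute force for a tiny concept class. It assumes the definition $\gamma^{C'} = \max_a \min_{b \in \{0,1\}} |\{c \in C' : c(a) = b\}| / |C'|$ from Definition III.1 (which lies outside this section). Concepts are represented as tuples of 0/1 values, and the partition is given by a hypothetical list `blocks` holding the index of the piece that contains each concept. All names are illustrative, and the enumeration is exponential in $|C|$.

```python
from fractions import Fraction
from itertools import combinations


def gamma(concepts):
    """gamma^{C'}: best achievable minority fraction over all inputs (assumed Definition III.1)."""
    size = len(concepts)
    best = Fraction(0)
    for a in range(len(concepts[0])):
        ones = sum(c[a] for c in concepts)
        best = max(best, Fraction(min(ones, size - ones), size))
    return best


def in_S(indices, blocks):
    """C' is in S iff no single piece of the partition holds >= 3/4 of C'."""
    counts = {}
    for i in indices:
        counts[blocks[i]] = counts.get(blocks[i], 0) + 1
    return 4 * max(counts.values()) < 3 * len(indices)


def gamma_hat_partition(concepts, blocks):
    """Minimum of gamma^{C'} over all C' in S, or None if S is empty."""
    best = None
    for r in range(2, len(concepts) + 1):
        for indices in combinations(range(len(concepts)), r):
            if in_S(indices, blocks):
                g = gamma([concepts[i] for i in indices])
                best = g if best is None else min(best, g)
    return best


# The four "singleton" concepts over a domain of four inputs.
singletons = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
print(gamma_hat_partition(singletons, [0, 1, 2, 3]))  # exact learning: prints 1/4
print(gamma_hat_partition(singletons, [0, 0, 0, 1]))  # one large piece: prints 1/3
```

In the second call the whole class is not in $\mathcal{S}$, because one piece holds exactly $\frac{3}{4}$ of it, so the minimum is attained on a three-element subset instead.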

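The next lemma shows that for each $C' \in \mathcal{S}$, the lower bounds for exact learning $R(C') = \Omega(\frac{1}{\gamma^{C'}})$ and
$Q(C') = \Omega(\frac{1}{\sqrt{\gamma^{C'}}})$ which are implied by Theorems III.2 and III.4 extend to the problem of learning a partition to yield
$R_{\mathcal{P}}(C') = \Omega(\frac{1}{\gamma^{C'}})$ and $Q_{\mathcal{P}}(C') = \Omega(\frac{1}{\sqrt{\gamma^{C'}}})$. By considering the $C' \in \mathcal{S}$ which minimizes $\gamma^{C'}$, we obtain the strongest
lower bound (this is the motivation behind Definition V.3).

Lemma V.4 For any partition $\mathcal{P} = P_1, \dots, P_k$ of the concept class $C$, we have $R_{\mathcal{P}}(C) = \Omega(\frac{1}{\hat{\gamma}^C_{\mathcal{P}}})$ and
$Q_{\mathcal{P}}(C) = \Omega(\frac{1}{\sqrt{\hat{\gamma}^C_{\mathcal{P}}}})$.
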

Proof: Let $C' \in \mathcal{S}$ be such that $\gamma^{C'} = \hat{\gamma}^C_{\mathcal{P}}$. We consider the problem of learning the partition induced by $\mathcal{P}$ over
$C'$, and shall prove the lower bound for this easier problem. We may assume without loss of generality that $C'$ is
1-sensitive.

We first consider the classical case. We claim that there is a partition $\{S_1, S_2\}$ of $C'$ with the property that each
subset $(P_j \cap C')$ is contained entirely in exactly one of $S_1, S_2$ (i.e. $S_1, S_2$ is a "coarsening" of the partition induced
by $\mathcal{P}$ over $C'$) which satisfies $\min_{i=1,2} |S_i| > \frac{1}{4}|C'|$. To see this, we may start with $S_2 = \emptyset$, $S_1 = \bigcup_{i=1}^{k}(P_i \cap C') = C'$
and consider a process of "growing" $S_2$ by successively removing the smallest piece $P_j \cap C'$ from $S_1$ and adding it to
$S_2$. W.l.o.g. we may suppose that $|P_j \cap C'| \le |P_{j+1} \cap C'|$ for all $j$, so the pieces $P_j \cap C'$ are added to $S_2$ in order
of increasing $j = 1, 2, \dots$. Let $t$ be the index such that adding $P_t \cap C'$ to $S_2$ causes $|S_2|$ to exceed $\frac{1}{4}|C'|$ for the first
time. By Definition V.3 it cannot be the case that adding $P_t \cap C'$ causes $S_2$ to become all of $C'$ (since this would
mean that $P_t \cap C'$ is a subset of size at least $\frac{3}{4}|C'|$ which intersects only $P_t$); thus it must be the case that after this
$t$-th step $S_1$ is still nonempty. However, it also cannot be the case that after this $t$-th step we have $|S_1| \le \frac{1}{4}|C'|$; for if
this were the case, then after the $t$-th step we would have $|P_t \cap C'| \ge \frac{1}{2}|C'| > |S_1| = |\bigcup_{j=t+1}^{k}(P_j \cap C')|$, and this would
violate the assumption that sets are added to $S_2$ in order of increasing size.
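
The "growing" process in this argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: pieces are represented by their sizes $|P_j \cap C'|$ alone, and the function name `coarsen` is hypothetical. The precondition that no piece reaches $\frac{3}{4}$ of the total is exactly membership of $C'$ in $\mathcal{S}$.

```python
def coarsen(piece_sizes):
    """Split pieces into (S1, S2) by moving the smallest pieces into S2
    until |S2| first exceeds a quarter of the total size."""
    sizes = sorted(piece_sizes)
    total = sum(sizes)
    if 4 * sizes[-1] >= 3 * total:
        raise ValueError("some piece holds at least 3/4 of C', so C' is not in S")
    s2_size = 0
    t = 0
    while 4 * s2_size <= total:  # |S2| has not yet exceeded |C'|/4
        s2_size += sizes[t]
        t += 1
    return sizes[t:], sizes[:t]  # pieces left in S1, pieces moved to S2


s1, s2 = coarsen([1, 1, 2, 3, 5])
print(s1, s2)  # [3, 5] [1, 1, 2]
```

Here the total size is 12, so both sides (8 and 4) exceed the threshold 3, as the claim requires.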

Since $C'$ is 1-sensitive, the "worst case" for a learning algorithm is that each classical query to the target concept
(some $c \in C'$) yields 0. By definition of $\gamma^{C'}$, each such query eliminates at most $\gamma^{C'} \cdot |C'|$ many possible target
concepts from $C'$. Consequently, after $\lfloor \frac{1}{4\gamma^{C'}} \rfloor - 1$ classical queries, the set of possible target concepts in $C'$ is of size at
least $\frac{3}{4}|C'|$, and so it must intersect both $S_1$ and $S_2$. It is thus impossible to determine with probability greater than
1/2 whether $c$ belongs to $S_1$ or $S_2$, and thus which piece $P_i$ of $\mathcal{P}$ contains $c$. This gives the classical lower bound.

Our analysis for the quantum case requires some basic definitions and facts about quantum computing:

Definition V.5 If $|\phi\rangle = \sum_z \alpha_z |z\rangle$ and $|\psi\rangle = \sum_z \beta_z |z\rangle$ are two superpositions of basis states, then the Euclidean
distance between $|\phi\rangle$ and $|\psi\rangle$ is $\| |\phi\rangle - |\psi\rangle \| = (\sum_z |\alpha_z - \beta_z|^2)^{1/2}$. The total variation distance between two distributions
$\mathcal{D}_1$ and $\mathcal{D}_2$ is defined to be $\sum_x |\mathcal{D}_1(x) - \mathcal{D}_2(x)|$.

Fact V.6 (See 0) Let $|\phi\rangle$ and $|\psi\rangle$ be two unit length superpositions which represent possible states of a quantum
register. If the Euclidean distance $\| |\phi\rangle - |\psi\rangle \|$ is at most $\epsilon$, then performing the same observation on $|\phi\rangle$ and $|\psi\rangle$
induces distributions $\mathcal{D}_\phi$ and $\mathcal{D}_\psi$ which have total variation distance at most $4\epsilon$.
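
A quick numerical sanity check of Fact V.6 is sketched below. It is not a proof and covers only one kind of observation, namely measurement in the computational basis; the total variation distance is taken as $\sum_x |\mathcal{D}_1(x) - \mathcal{D}_2(x)|$ with no factor of $\frac{1}{2}$, as in Definition V.5. The helper names and the choice of random states are assumptions made for the example.

```python
import math
import random


def euclidean_distance(phi, psi):
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(phi, psi)))


def total_variation(phi, psi):
    """Distance between the two computational-basis measurement distributions."""
    return sum(abs(abs(a) ** 2 - abs(b) ** 2) for a, b in zip(phi, psi))


def normalize(vec):
    norm = math.sqrt(sum(abs(a) ** 2 for a in vec))
    return [a / norm for a in vec]


def random_state(dim, rng):
    return normalize([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)])


def nearby_state(phi, scale, rng):
    """A unit vector obtained by adding small Gaussian noise to phi."""
    return normalize([a + scale * complex(rng.gauss(0, 1), rng.gauss(0, 1)) for a in phi])


rng = random.Random(0)
phi = random_state(8, rng)
psi = nearby_state(phi, 0.05, rng)
eps = euclidean_distance(phi, psi)
print(total_variation(phi, psi) <= 4 * eps)  # True
```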

For the quantum lower bound, suppose we have a quantum learning algorithm which makes at most $T = \lfloor \frac{1}{64\sqrt{\gamma^{C'}}} \rfloor - 1$
quantum membership queries. We will use the following result which combines Theorem 6 and Lemma 7 from [20]
(those results are in turn based on Theorem 6.6 of Q):



Lemma V.7 (See [20]) Consider any quantum exact learning algorithm $\mathcal{N}$ which makes $T$ quantum membership
queries. Let $|\phi_T^c\rangle$ denote the state of the quantum register after all $T$ membership queries are performed in the
algorithm, if the target concept is $c$. Then for any 1-sensitive set $C'$ of concepts with $|C'| \ge 2$ and any $\epsilon > 0$, there
is a set $S \subseteq C'$ of cardinality at most $T^2 |C'| \gamma^{C'} / \epsilon^2$ such that for all $c \in C' \setminus S$, we have $\| |\phi_T^c\rangle - |\phi_T^{\mathbf{0}}\rangle \| < \epsilon$ (where
$\mathbf{0}$ denotes the identically 0 concept).

If we take $\epsilon = \frac{1}{32}$ then Lemma V.7 implies that there exists a set $S \subseteq C'$ of cardinality less than $\frac{1}{4} \cdot |C'|$ such that for
all $c \in C' \setminus S$ one has $\| |\phi_T^c\rangle - |\phi_T^{\mathbf{0}}\rangle \| < \frac{1}{32}$. Consequently by Definition V.3 there must exist two concepts $c_1, c_2 \in C' \setminus S$
with $\| |\phi_T^{c_1}\rangle - |\phi_T^{c_2}\rangle \| < \frac{1}{16}$ which belong to different subsets $P_i$ and $P_j$ of $\mathcal{P}$. By Fact V.6 the probability that our
quantum learning algorithm outputs "i" can differ by at most $\frac{1}{4}$ when the target concept is $c_1$ versus $c_2$; but this
contradicts the assumption that the algorithm is correct with probability at least 2/3 on all target concepts. This
proves the quantum lower bound. ■

Before proving the main result of this section, we establish the following result which gives a sufficient condition for
the quantum and classical complexities $Q_{\mathcal{P}}(C)$ and $R_{\mathcal{P}}(C)$ of learning a partition to be polynomially related. This
result is a generalization of Corollary IV.1.

Corollary V.8 For a partition $\mathcal{P}$ over the concept class $C$, if the size of the largest subset $P_i$ in $\mathcal{P}$ is less than
$\frac{3}{4\hat{\gamma}^C}$ then we have $R_{\mathcal{P}}(C) = O(nQ_{\mathcal{P}}(C)^3)$.

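Proof: Let $C'$ be a subset of $C$ for which $\gamma^{C'}$ equals $\hat{\gamma}^C$. We have that $|C'| \ge \frac{1}{\hat{\gamma}^C}$ by Definition III.1. Thus any $\frac{3}{4}$
fraction of $C'$ must intersect at least two subsets in $\mathcal{P}$, so $C'$ must belong to $\mathcal{S}$. This forces $\hat{\gamma}^C_{\mathcal{P}} = \gamma^{C'} = \hat{\gamma}^C$. Moreover, we
have that $|\mathcal{P}| > \frac{4}{3}\hat{\gamma}^C \cdot |C|$, and we know that $\hat{\gamma}^C \ge \frac{1}{2^n}$ by Lemma III.8. Thus we have $\log|\mathcal{P}| > \log\frac{4}{3} - n + \log|C|$,
and consequently $\frac{\log|\mathcal{P}|}{n} > \frac{\log|C|}{n} - 1$. Lemmas V.2 and V.4 yield $Q_{\mathcal{P}}(C) = \Omega(\frac{1}{\sqrt{\hat{\gamma}^C}} + \frac{\log|\mathcal{P}|}{n}) = \Omega(\frac{1}{\sqrt{\hat{\gamma}^C}} + \frac{\log|C|}{n})$.
Combining this with the bound $R_{\mathcal{P}}(C) = O(\frac{\log|C|}{\hat{\gamma}^C})$ (which clearly follows from Theorem III.3 since the partition
learning problem is no harder than the exact learning problem), we have that $R_{\mathcal{P}}(C)$ must be $O(nQ_{\mathcal{P}}(C)^3)$. ■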

We note here that we could have used any constant A satisfying | < A < 1 in Definition V.3 in place of 3/4, and
obtained corresponding versions of Lemma V.4 and the above corollary with A in place of 3/4.






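Algorithm 3: A slightly modified version of Algorithm 1 to be used in generating a partition.
Require: $C$ is 1-sensitive.
$R \leftarrow \emptyset$, $\mathcal{I} \leftarrow \emptyset$, $\mathcal{J} \leftarrow \emptyset$.
repeat
    Perform a column flip on $C \setminus R$ to make it 1-sensitive; call the resulting 1-sensitive matrix $M$.
    $a_{\max} \leftarrow$ the input in $\{0,1\}^n \setminus \mathcal{I}$ at which the highest fraction of concepts in $C \setminus R$ yield 1.
    $\mathcal{I} \leftarrow \mathcal{I} \cup \{a_{\max}\}$.
    $R \leftarrow R\ \cup$ the set of concepts in $M$ that yield 1 at input $a_{\max}$.
    if the column corresponding to $a_{\max}$ in $M$ was flipped relative to $C$ then
        $\mathcal{J} \leftarrow \mathcal{J} \cup \{a_{\max}\}$.
    end if
until $|R| \ge |C|/2$.
Output $(\mathcal{I}, \mathcal{J})$.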



Now we prove our main result of this subsection, showing that for any concept class $C$ and any partition size
bound $2 \le k \le |C|$ there is a partition of $C$ into $k$ pieces such that the classical and quantum query complexities are
polynomially related:

Theorem V.9 Let $C$ be any concept class and $k$ any integer satisfying $2 \le k \le |C|$. Then there is a partition $\mathcal{P}$ of $C$
with $|\mathcal{P}| = k$ for which we have $R_{\mathcal{P}}(C) = O(nQ_{\mathcal{P}}(C)^3)$.

Proof: We will show that Algorithm 4 constructs a partition $\mathcal{P}$ with the desired properties. Algorithm 4 uses a
slightly modified version of Algorithm 1, which we call Algorithm 3. Algorithm 3 differs from Algorithm 1 in that if
the input $a_{\max}$ corresponds to a column which is flipped in the column flip on $C \setminus R$, then Algorithm 3 augments $R$
by adding those concepts in the flipped version of $C \setminus R$ which yield 1 on $a_{\max}$ (note that by 1-sensitivity this is fewer
than half of the concepts in $C \setminus R$), whereas Algorithm 1 adds those concepts which yield 1 on $a_{\max}$ in the unflipped
(original) version of $C \setminus R$. Thus at each stage Algorithm 3 grows the set $R$ by adding at most half of the remaining
concepts in $C \setminus R$; we will need this property later. The analysis of Algorithm 1 carries over to show that the set $\mathcal{I}$
of inputs which Algorithm 3 constructs is of size at most $|\mathcal{I}| \le \frac{1}{\hat{\gamma}^C}$.

At each iteration of the outer repeat loop, Algorithm 4 successively refines the partition $\mathcal{Q}$ until $|\mathcal{Q}| = k$. Let $C' \subseteq C$
be such that $\gamma^{C'} = \hat{\gamma}^C$. The first time Algorithm 4 passes through the inner repeat loop we will have $|C| = |S|$ and
thus Algorithm 3 will be invoked on $C'$. We will write $C^\circ, C^*$ to denote these sets $S^\circ, S^*$ of concepts that are formed
out of $C$ in this first iteration. The final partition $\mathcal{P}$ will ultimately be a refinement of the partition $\{C^\circ, C^*\}$ obtained
in this step; we will see later that this will force $\hat{\gamma}^C_{\mathcal{P}} = \hat{\gamma}^C$ (this is why the first iteration is treated differently than
later iterations).

In addition to constructing the partition $\mathcal{P}$, the execution of Algorithm 4 should also be viewed as a "memoization"
process in which various sets of inputs $\mathcal{I}(S)$, $\mathcal{J}(S)$ and $\mathcal{K}(S)$ are defined to correspond to different sets of concepts $S$.
These sets will be used during the execution of Algorithm 5 later. Roughly speaking, the division of $S$ in each iteration
depends only on the values on inputs in $\mathcal{I}(S)$, the set $\mathcal{J}(S)$ is used to keep track of the column flips Algorithm 3
performs, and the set $\mathcal{K}(S)$ keeps track of those inputs which need to be flipped to achieve 1-sensitivity.

We now explain the outer loop of Algorithm 4 in more detail. The algorithm works in a breadth-first fashion to
successively refine the partition $\mathcal{Q}$, which is initially just $\{C\}$, into the final partition $\mathcal{P}$. After the first iteration of
the outer loop, $C$ has been partitioned into $\{C^\circ, C^*\}$. Similarly, in the second iteration each of these sets is divided
in two to give a four-way partition. The algorithm continues in this manner until the desired number of elements in
the partition is reached. The main idea of the construction is that each division of a set $S$ (after the first iteration)
creates two pieces $S^\circ$ and $S^*$ of almost equal size as we shall describe below. Because degenerate divisions do not
occur, we will see that the algorithm will terminate after at most $O(\log k)$ iterations of the outer loop.

Recall from above that each invocation of Algorithm 3 in Algorithm 4 on a set $S$ of concepts yields a set $\mathcal{I}(S)$ of at
most $\frac{1}{\hat{\gamma}^C}$ many inputs. By flipping the output of concepts in $S$ at inputs in $\mathcal{J}(S)$ in Step 14 of Algorithm 4, we ensure
that the sets $S^\circ$ and $S^*$ defined in Steps 15 and 16 correspond precisely to the sets $C \setminus R$ and $R$ of Algorithm 3 when
it terminates. It thus follows from the termination condition of Algorithm 3 that $|S^*|/|S| \ge \frac{1}{2}$. Recall also from the
discussion in the first paragraph of this proof that the last iteration of the loop of Algorithm 3 adds at most half of
the remaining concepts into the set $R$. Therefore we have that the set $S^\circ$ in Algorithm 4 must satisfy $|S^\circ|/|S| > \frac{1}{4}$. It
follows from these bounds on $|S^\circ|/|S|$ and $|S^*|/|S|$ that Algorithm 4 makes at most $O(\log k)$ many iterations through
the outer loop.

It is therefore clear that at each iteration of the main loop of Algorithm 4 for which $|S| \ge 2$, each of the sets $S^\circ$ and
$S^*$ formed from $S$ will be nonempty. This ensures that the algorithm will keep producing new elements in the partition

Algorithm 4: Constructing a partition for which $R_{\mathcal{P}}(C)$ and $Q_{\mathcal{P}}(C)$ are polynomially related.

1: $\mathcal{Q} \leftarrow \{C\}$.
2: repeat
3:     $\mathcal{R} \leftarrow \emptyset$.
4:     repeat
5:         $S \leftarrow$ an element in $\mathcal{Q}$.
6:         if $|S| \ge 2$ then
7:             if $|C| = |S|$ then
8:                 Let $\mathcal{K}(S)$ denote the inputs which, if flipped, would make $C'$ be 1-sensitive ($C'$ is defined in the 2nd paragraph of
                   the proof of Theorem V.9). Flip the values of concepts in $S$ at inputs in $\mathcal{K}(S)$.
9:                 $(\mathcal{I}(S), \mathcal{J}(S)) \leftarrow$ the output of Algorithm 3 invoked with $C'$.
10:            else
11:                Let $\mathcal{K}(S)$ denote the inputs which, if flipped, would make $S$ be 1-sensitive. Flip the values of concepts in $S$ at
                   inputs in $\mathcal{K}(S)$.
12:                $(\mathcal{I}(S), \mathcal{J}(S)) \leftarrow$ the output of Algorithm 3 invoked with input $S$.
13:            end if
14:            Flip the values of concepts in $S$ at those inputs in $\mathcal{J}(S)$.
15:            $S^\circ \leftarrow$ {the concepts in $S$ that yield 0 for each $x \in \mathcal{I}(S)$}.
16:            $S^* \leftarrow S \setminus S^\circ$.
17:            Flip the values of concepts in $S^\circ, S^*$ for all elements in $\mathcal{J}(S)$.
18:            Flip the values of concepts in $S^\circ, S^*$ for all elements in $\mathcal{K}(S)$.
19:            $\mathcal{R} \leftarrow \mathcal{R} \cup \{S^\circ, S^*\}$.
20:        else
21:            $\mathcal{R} \leftarrow \mathcal{R} \cup \{S\}$.
22:        end if
23:        $\mathcal{Q} \leftarrow \mathcal{Q} \setminus \{S\}$.
24:    until $\mathcal{Q} = \emptyset$ or $|\mathcal{Q}| + |\mathcal{R}| = k$.
25:    $\mathcal{Q} \leftarrow \mathcal{Q} \cup \mathcal{R}$.
26: until $|\mathcal{Q}| = k$.
27: $\mathcal{P} \leftarrow \mathcal{Q}$.



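until $|\mathcal{P}| = k$ is reached. The same argument shows that $C^\circ, C^*$ are each nonempty and satisfy $|C^\circ \cap C'|/|C'| > \frac{1}{4}$
and $|C^* \cap C'|/|C'| \ge \frac{1}{2}$. This implies that $C'$ is an element of the set $\mathcal{S}$ of Definition V.3: any subset $C'' \subseteq C'$ with
$|C''| \ge \frac{3}{4}|C'|$ must intersect both $C^\circ$ and $C^*$, and thus must intersect at least two subsets of $\mathcal{P}$ since $\mathcal{P}$ is a refinement
of $\{C^\circ, C^*\}$. Consequently we have $\hat{\gamma}^C_{\mathcal{P}} = \hat{\gamma}^C$.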

To show that the partition $\mathcal{P}$ satisfies $R_{\mathcal{P}}(C) = O(nQ_{\mathcal{P}}(C)^3)$, we now give an analysis of the query complexity of
learning $\mathcal{P}$ with both classical and quantum resources. As we will see, we need to give a classical upper bound and a
quantum lower bound to obtain our goal.

In the classical case, we will show that Algorithm 5 makes $O(\frac{\log|\mathcal{P}|}{\hat{\gamma}^C})$ queries and successfully learns the partition $\mathcal{P}$.
Using the sets $\mathcal{I}(S)$ which were defined by the execution of Algorithm 4, Algorithm 5 makes its way down the correct
branch of the binary tree implicit in the successive refinements of Algorithm 4 to find the correct piece of the partition
which contains $c$. More precisely, at the end of the $t$-th iteration of the outer loop of Algorithm 5, the set $S$ which
Algorithm 5 has just obtained will be identical to the piece in which $c$ resides of the partition constructed by Algorithm 4
at the end of the $t$-th iteration of its outer loop. As shown above, it takes $O(\log k) = O(\log|\mathcal{P}|)$ iterations until the
subset which the target concept $c$ lies in is reached. Moreover, by the same argument as in Lemma III.10, Algorithm 3
always outputs a set of inputs $\mathcal{I}(S)$ with size at most $\frac{1}{\hat{\gamma}^S} \le \frac{1}{\hat{\gamma}^C}$ when invoked on a set of concepts $S$. Therefore at each
of these $O(\log|\mathcal{P}|)$ iterations Algorithm 5 makes at most $\frac{1}{\hat{\gamma}^C}$ many queries. Thus Algorithm 5 is a classical algorithm
that learns $\mathcal{P}$ using $O(\frac{\log|\mathcal{P}|}{\hat{\gamma}^C})$ queries, so we have $R_{\mathcal{P}}(C) = O(\frac{\log|\mathcal{P}|}{\hat{\gamma}^C})$.

In the quantum case: since we have $\hat{\gamma}^C_{\mathcal{P}} = \hat{\gamma}^C$, by Lemma V.4 any quantum algorithm learning $\mathcal{P}$ must perform
$\Omega(\frac{1}{\sqrt{\hat{\gamma}^C}})$ quantum membership queries. Combining this result with that of Lemma V.2, we have that
$Q_{\mathcal{P}}(C) = \Omega(\frac{\log|\mathcal{P}|}{n} + \frac{1}{\sqrt{\hat{\gamma}^C}})$. Combining this inequality with the classical upper bound $R_{\mathcal{P}}(C) = O(\frac{\log|\mathcal{P}|}{\hat{\gamma}^C})$ from Algorithm 5, we
have that $R_{\mathcal{P}}(C) = O(nQ_{\mathcal{P}}(C)^3)$ for this partition $\mathcal{P}$, and we are done. ■

Algorithm 5: A classical algorithm learning $\mathcal{P}$.

$S \leftarrow C$.
repeat
    Flip the values of concepts in $S$ at those inputs in $\mathcal{K}(S)$.
    Flip the values of concepts in $S$ at those inputs in $\mathcal{J}(S)$.
    Classically query the given oracle implementing $c$ at all elements in $\mathcal{I}(S)$.
    $Z \leftarrow$ True if $c$ yields 0 for all elements in $\mathcal{I}(S)$. $Z \leftarrow$ False otherwise.
    if $Z$ then
        $S \leftarrow$ {the concepts in $S$ that yield 0 for each $x \in \mathcal{I}(S)$}.
    else
        $S \leftarrow$ {the concepts in $S$ that yield 1 for at least one $x \in \mathcal{I}(S)$}.
    end if
    Flip the values of concepts in $S$ at those inputs in $\mathcal{J}(S)$.
    Flip the values of concepts in $S$ at those inputs in $\mathcal{K}(S)$.
until $S \in \mathcal{P}$.
Output $S$.



B. A Partition Problem with a Large Quantum-Classical Gap 

In the previous subsection we showed that for any concept class and any partition size bound, there is a partition 
problem for which the classical and quantum query complexities are polynomially related. In this section, by adapting 
a result of Simon we show that for a wide range of values of the partition size bound, it is possible for the classical 
query complexity to be superpolynomially larger than the quantum query complexity: 

Theorem V.10 Let $n = m + \log m$. For any value $1 \le \ell \le m$, there is a concept class $C$ over $\{0,1\}^n$ with
$|C| < 2^{m\ell - \ell^2 + \ell + m2^{m-\ell}}$ and a partition $\mathcal{P}$ of $C$ with $|\mathcal{P}| > 2^{m\ell - \ell^2 - \ell}$ such that $R_{\mathcal{P}}(C) = \Omega(2^{(m-\ell)/3})$ and $Q_{\mathcal{P}}(C) = \mathrm{poly}(m)$.

Taking $\ell = m - \alpha(m)$ where $\alpha(m)$ is any function which is $\omega(\log m)$, we obtain $R_{\mathcal{P}}(C) = m^{\omega(1)}$ whereas
$Q_{\mathcal{P}}(C) = \mathrm{poly}(m)$, for a superpolynomial separation between classical and quantum query complexity. Such a choice of
$\ell$ gives $|\mathcal{P}| = 2^{\Omega(m)}$ whereas $|C|$ is roughly $2^{m2^{\alpha(m)}}$. Note that the size of $|C|$ can be made to be $2^{\beta(m)}$ for any slightly
superpolynomial function $\beta(m)$ via a suitable choice of $\alpha(m) = \omega(\log m)$. This means that viewed as a function of
$|C|$, it is possible for $|\mathcal{P}|$ to be as large as $2^{(\log|C|)^{\epsilon}}$ for any function $\epsilon(\cdot) = o(1)$ and still have the classical query
complexity be superpolynomially larger than the quantum query complexity.

Proof of Theorem V.10: We will use a result of Simon who considers functions $f : \{0,1\}^m \to \{0,1\}^m$. Any
such function $f : \{0,1\}^m \to \{0,1\}^m$ can equivalently be viewed as a function $\tilde{f} : (\{0,1\}^m \times \{1, \dots, m\}) \to \{0,1\}$
where $\tilde{f}(x,j)$ equals $f(x)_j$, the $j$-th bit of $f(x)$. It is easy to see that we can simulate a call to an oracle for
$f : \{0,1\}^m \to \{0,1\}^m$ by making $m$ membership queries to the oracle for $\tilde{f}$, in both the classical and quantum case.
This extra factor of $m$ is immaterial for our bounds, so we will henceforth discuss functions $f$ which map $\{0,1\}^m$ to
$\{0,1\}^m$.

We view the input space $\{0,1\}^m$ as the vector space $\mathbb{F}_2^m$. Given an $\ell$-dimensional vector subspace $V \subseteq \mathbb{F}_2^m$, we say that
a function $f : \{0,1\}^m \to \{0,1\}^m$ is $V$-invariant if the following condition holds: $f(x) = f(y)$ if and only if $x = y \oplus v$
for some $v \in V$. Thus a $V$-invariant function $f$ is defined by the $2^{m-\ell}$ distinct values it takes on representatives of the
$2^{m-\ell}$ cosets of $V$. The concept class $C$ is the class of all functions $f$ which are $V$-invariant for some $\ell$-dimensional
vector subspace $V$ of $\mathbb{F}_2^m$.

A simple counting argument shows that there are

$$N_{m,\ell} = \frac{(2^m - 1)(2^m - 2)(2^m - 4) \cdots (2^m - 2^{\ell-1})}{(2^\ell - 1)(2^\ell - 2)(2^\ell - 4) \cdots (2^\ell - 2^{\ell-1})}$$

many $\ell$-dimensional subspaces of $\mathbb{F}_2^m$. This is because there are $(2^m - 1)(2^m - 2)(2^m - 4) \cdots (2^m - 2^{\ell-1})$ ways to
choose an ordered list of $\ell$ linearly independent vectors to form a basis for $V$, and given any $V$ there are
$(2^\ell - 1)(2^\ell - 2)(2^\ell - 4) \cdots (2^\ell - 2^{\ell-1})$ ordered lists of vectors from $V$ which could serve as a basis.
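
The count of $\ell$-dimensional subspaces can be checked by brute force for small parameters. In the sketch below, vectors of $\mathbb{F}_2^m$ are encoded as integer bitmasks and addition is XOR; the function names are illustrative, and the enumeration is only feasible for small $m$.

```python
from itertools import combinations


def num_subspaces(m, l):
    """Number of l-dimensional subspaces of F_2^m, via the product formula."""
    numerator = denominator = 1
    for i in range(l):
        numerator *= 2 ** m - 2 ** i
        denominator *= 2 ** l - 2 ** i
    return numerator // denominator


def span(basis):
    """All F_2-linear combinations of the given bitmask vectors."""
    vectors = {0}
    for b in basis:
        vectors |= {v ^ b for v in vectors}
    return frozenset(vectors)


def brute_force_count(m, l):
    """Count distinct spans of size 2^l over all l-subsets of nonzero vectors."""
    subspaces = set()
    for candidate in combinations(range(1, 2 ** m), l):
        s = span(candidate)
        if len(s) == 2 ** l:  # the candidate vectors are linearly independent
            subspaces.add(s)
    return len(subspaces)


print(num_subspaces(4, 2), brute_force_count(4, 2))  # 35 35
```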

We define the partition $\mathcal{P}$ to divide $C$ into $N_{m,\ell}$ equal-size subsets, one for each $\ell$-dimensional vector subspace $V$;
the subset of concepts corresponding to a given $V$ is precisely those functions which are $V$-invariant. For any given
$\ell$-dimensional subspace $V$, the number of functions that are $V$-invariant is

$$I_{m,\ell} = 2^m (2^m - 1)(2^m - 2) \cdots (2^m - 2^{m-\ell} + 1)$$

since one can uniquely define such a function by specifying distinct values to be attained on each of the $2^{m-\ell}$ coset
representatives.

Therefore we have $|C| = N_{m,\ell} \cdot I_{m,\ell}$, and it is easy to check that $2^{m\ell - \ell^2 - \ell} < N_{m,\ell} < 2^{m\ell - \ell^2 + \ell}$ and $I_{m,\ell} < 2^{m2^{m-\ell}}$.
It remains only to analyze the quantum and classical query complexities.

For the quantum case, it follows easily from Simon's analysis of his algorithm in [22] that for any $V$-invariant $f$, each
iteration of Simon's algorithm (which requires a single quantum query to $f$) will yield a vector that is independently
and uniformly drawn from the $(m-\ell)$-dimensional subspace $V^\perp$. A standard analysis shows that after $O(m)$ iterations,
with very high probability we will have obtained a set of vectors that span $V^\perp$; from these it is easy to identify $V$
and thus the piece of the partition to which $f$ belongs.
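
The classical post-processing in this argument can be sketched as follows. The code does not simulate the quantum circuit: it takes as given that each iteration returns a uniformly random element of $V^\perp$ and replaces that measurement by a classical random draw. Vectors are integer bitmasks, the helper names are hypothetical, and the orthogonal complement is computed by exhaustive search, which is only reasonable for small $m$.

```python
import random


def dot(x, y):
    """Inner product over F_2 of two bitmask vectors."""
    return bin(x & y).count("1") % 2


def span(basis):
    vectors = {0}
    for b in basis:
        vectors |= {v ^ b for v in vectors}
    return vectors


def orthogonal_complement(vectors, m):
    """All y in F_2^m that are orthogonal to every given vector."""
    return {y for y in range(2 ** m) if all(dot(y, v) == 0 for v in vectors)}


def identify_subspace(m, hidden_basis, iterations, rng):
    """Recover V from `iterations` uniform samples of V^perp."""
    v_perp = sorted(orthogonal_complement(hidden_basis, m))
    # Stand-in for the outcomes of Simon's algorithm: uniform draws from V^perp.
    samples = [rng.choice(v_perp) for _ in range(iterations)]
    # Once the samples span V^perp, their orthogonal complement is exactly V.
    return orthogonal_complement(samples, m)


rng = random.Random(0)
hidden_basis = [0b00011, 0b01100]  # a 2-dimensional subspace V of F_2^5
recovered = identify_subspace(5, hidden_basis, 50, rng)
print(recovered == span(hidden_basis))
```

With 50 draws from the 8-element space $V^\perp$, the samples fail to span it only with probability at most $7 \cdot 2^{-50}$, since they would all have to land in one of its 7 hyperplanes.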

For the classical case, an analysis much like that of Section 3.2 of [22] can be used to show that any classical
algorithm which correctly identifies the vector subspace $V$ with high probability must make $\Omega(2^{(m-\ell)/3})$ many queries;
since the proof is similar to [22] we only sketch the main ideas here. We say that a sequence of queries is good if it
contains two distinct queries which yield the same output (i.e. two queries $x, y$ which have $(x \oplus y) \in V$), and otherwise
the sequence is bad. The argument of [22] applied to our setting shows that if the target vector subspace is chosen
uniformly at random, then any classical algorithm making $M = 2^{(m-\ell)/3}$ queries makes a good sequence of queries
with very small probability. On the other hand, if a sequence of $M = 2^{(m-\ell)/3}$ queries is bad, then this restricts the
possibilities for $V$ by establishing a set $S$ of $\binom{M}{2} < 2^{2(m-\ell)/3}$ many "forbidden" vectors from $\mathbb{F}_2^m$ which must not
belong to the target vector space $V$ (since for each pair of queries $x, y$ in the sequence we know that $(x \oplus y) \notin V$). Given a fixed
nonzero vector $z \in \mathbb{F}_2^m$, we have that a random $\ell$-dimensional vector space $W$ contains $z$ with probability $\frac{2^\ell - 1}{2^m - 1} < \frac{1}{2^{m-\ell}}$,
and consequently the probability that a random $W$ contains any element of $S$ is at most $2^{2(m-\ell)/3} \cdot \frac{1}{2^{m-\ell}} = 2^{(\ell-m)/3}$,
which is less than 1/2 if $\ell < m - 6$ (and if $\ell \ge m - 6$ the bound of the theorem is trivially true). Thus at least half of
the $N_{m,\ell}$ possible $\ell$-dimensional vector subspaces are compatible with any given bad sequence of $2^{(m-\ell)/3}$ queries, so
the classical algorithm cannot have identified the right subspace with nonnegligible probability. ■

VI. QUANTUM VERSUS CLASSICAL PAC LEARNING 

A. The Quantum PAC Learning Model 

The quantum PAC learning model was introduced by Bshouty and Jackson in [9]. A quantum PAC learning
algorithm is defined analogously to a classical PAC algorithm, but instead of having access to a random example
oracle $EX(c, \mathcal{D})$ it can (repeatedly) access a quantum superposition of labeled examples. The goal of constructing
a classical Boolean circuit $h$ which is an $\epsilon$-approximator for $c$ with probability $1 - \delta$ is unchanged. More precisely,
for $\mathcal{D}$ a distribution over $\{0,1\}^n$ we say that the quantum example oracle $QEX(c, \mathcal{D})$ is a gate which transforms the
computational basis state $|0^n, 0\rangle$ as follows:

$$|0^n, 0\rangle \mapsto \sum_{x \in \{0,1\}^n} \sqrt{\mathcal{D}(x)}\, |x, c(x)\rangle.$$

We leave the action of a $QEX(c, \mathcal{D})$ gate undefined on other basis states, and we require that a quantum PAC
learning algorithm may only invoke a $QEX(c, \mathcal{D})$ oracle on the basis state $|0^n, 0\rangle$. It is easy to verify (see [9]) that a
$QEX(c, \mathcal{D})$ oracle can simulate a classical $EX(c, \mathcal{D})$ oracle.
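
The last remark can be made concrete with a small sketch: measuring the state produced by one $QEX(c, \mathcal{D})$ call in the computational basis yields the labeled example $(x, c(x))$ with probability $\mathcal{D}(x)$, which is exactly what $EX(c, \mathcal{D})$ returns. The dictionary-based representation and the function names below are assumptions made for the example, not part of the model's definition.

```python
import math


def qex_state(concept, dist):
    """Amplitudes of the state that QEX(c, D) produces from |0^n, 0>.

    concept maps each input x to its label c(x); dist maps x to D(x).
    Only basis states |x, c(x)> with D(x) > 0 carry amplitude.
    """
    return {(x, concept[x]): math.sqrt(p) for x, p in dist.items() if p > 0}


def measurement_distribution(state):
    """Outcome probabilities when the state is measured in the computational basis."""
    return {basis: amplitude ** 2 for basis, amplitude in state.items()}


# Parity on two bits under a non-uniform distribution.
concept = {"00": 0, "01": 1, "10": 1, "11": 0}
dist = {"00": 0.5, "01": 0.25, "10": 0.25, "11": 0.0}
for (x, label), prob in measurement_distribution(qex_state(concept, dist)).items():
    print(x, label, round(prob, 6))
```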

As noted in Lemma 6 of 01, we may assume without loss of generality (by renumbering qubits) that all $QEX(c, \mathcal{D})$
calls of a quantum PAC learning algorithm occur sequentially at the beginning of the algorithm's execution and
that the $t$-th invocation of $QEX(c, \mathcal{D})$ affects the qubits $(t-1)(n+1)+1, (t-1)(n+1)+2, \dots, t(n+1)$. After
all $T$ $QEX(c, \mathcal{D})$ queries have been performed, the algorithm performs a fixed unitary transformation and then a
measurement takes place. (See [9, 20] for more details on the quantum PAC learning model.) The quantum sample
complexity is the number of invocations $T$ of $QEX$ which the quantum PAC learning algorithm performs, i.e. the
number of $QEX$ gates in the quantum circuit corresponding to the quantum PAC learning algorithm.

The following definition plays an important role in the sample complexity of both classical and quantum PAC 
learning: 

Definition VI.1 If $C$ is a concept class over some domain $X$ and $W \subseteq X$, we say that $W$ is shattered by $C$ if for
every $W' \subseteq W$, there exists a $c \in C$ such that $W' = c \cap W$. The Vapnik-Chervonenkis dimension of $C$, VC-DIM($C$),
is the cardinality of the largest $W \subseteq X$ such that $W$ is shattered by $C$.
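
For small classes the definition can be evaluated directly. The sketch below is a brute-force illustration with hypothetical function names: each concept is a tuple of 0/1 labels indexed by the domain points, and a set $W$ is shattered exactly when the concepts induce all $2^{|W|}$ labelings on it. The search stops at the first size with no shattered set, which is valid because every subset of a shattered set is shattered.

```python
from itertools import combinations


def is_shattered(points, concepts):
    """True iff the concepts realize every 0/1 labeling of the given points."""
    patterns = {tuple(c[x] for x in points) for c in concepts}
    return len(patterns) == 2 ** len(points)


def vc_dimension(concepts):
    """Size of the largest shattered subset of the domain (brute force)."""
    domain_size = len(concepts[0])
    dimension = 0
    for k in range(1, domain_size + 1):
        if any(is_shattered(w, concepts) for w in combinations(range(domain_size), k)):
            dimension = k
        else:
            break
    return dimension


# All parity functions c_a(x) = a.x mod 2 over n = 3 Boolean variables.
parities = [tuple(bin(a & x).count("1") % 2 for x in range(8)) for a in range(8)]
print(vc_dimension(parities))  # 3
```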
B. Known Results on Quantum versus Classical PAC Learning

The classical sample complexity of PAC learning has been intensively studied and nearly matching upper and lower 
bounds are known: 

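Theorem VI.2 (i) [11] Any classical $(\epsilon, \delta)$-PAC learning algorithm for a non-trivial concept class $C$ of VC dimension
$d$ must have classical sample complexity $\Omega(\frac{1}{\epsilon}\log\frac{1}{\delta} + \frac{d}{\epsilon})$. (ii) 0/ Any concept class $C$ of VC dimension $d$ can be $(\epsilon, \delta)$-
PAC learned by a classical algorithm with sample complexity $O(\frac{1}{\epsilon}\log\frac{1}{\delta} + \frac{d}{\epsilon}\log\frac{1}{\epsilon})$.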

Servedio and Gortler [20] gave a lower bound on the quantum sample complexity of PAC learning. They showed
that for any concept class $C$ of VC dimension $d$ over $\{0,1\}^n$, if the distribution $\mathcal{D}$ is uniform over the $d$ examples in
some shattered set $S$, then even if the learning algorithm is allowed to make quantum membership queries on any
superposition of inputs in the domain $S$, any algorithm which with high probability outputs a high-accuracy hypothesis
with respect to $\mathcal{D}$ must make at least $\Omega(\frac{d}{n})$ many queries. Such membership queries can simulate $QEX(c, \mathcal{D})$ queries
since the support of $\mathcal{D}$ is $S$, and thus this gives a lower bound on the sample complexity of quantum PAC learning
with a $QEX$ oracle.



C. Improved Lower Bounds on Quantum Sample Complexity of PAC Learning 

In this section we give improved lower bounds on the sample complexity of quantum $(\epsilon, \delta)$-PAC learning algorithms
for concept classes $C$ of VC dimension $d$. These new bounds nearly match the classical lower bounds of [11].
We first note that the $\Omega(\frac{d}{n})$ lower bound of [20] can be easily strengthened to $\Omega(d)$:

Observation VI.3 Let $C$ be any concept class of VC dimension $d$ and let $\mathcal{D}$ be the uniform distribution over a
shattered set $S$ of size $d$. Then any quantum learning algorithm which (i) can make quantum membership queries on
any superposition of inputs in the domain $S$, and (ii) with high probability outputs a hypothesis with error rate at
most $\epsilon$ = , must make at least queries (and consequently the sample complexity of PAC learning $C$ with a $QEX$
oracle is $\Omega(d)$).

Recall that in the exact learning model, the concept class $C$ of all parity functions over $n$ Boolean variables has
VC dimension $d = n$ yet can be exactly learned with one call to a quantum membership oracle using the
Bernstein-Vazirani algorithm. In light of this, we feel that this improvement from $\Omega(\frac{d}{n})$ to $\Omega(d)$ is somewhat unexpected,
and may even at first appear contradictory. The key to the apparent contradiction is that the Bernstein-Vazirani
algorithm makes its membership query on a superposition of all $2^n$ inputs in $\{0,1\}^n$, not just the $n$ inputs in a fixed
shattered set $S$.
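
To make the role of the full superposition explicit, here is a small state-vector sketch of the Bernstein-Vazirani algorithm for the parity $c_a(x) = a \cdot x \bmod 2$. It assumes the oracle is used in its phase form $|x\rangle \mapsto (-1)^{c_a(x)}|x\rangle$ (the usual phase-kickback view of a membership query), and it stores all $2^n$ amplitudes, so it is only meant for small $n$. The function names are illustrative.

```python
def dot(x, y):
    """Inner product over F_2 of two bitmask vectors."""
    return bin(x & y).count("1") % 2


def bernstein_vazirani(a, n):
    """Return (most likely measurement outcome, outcome probabilities) after one query."""
    size = 2 ** n
    # Hadamard layer followed by a single phase query on all 2^n inputs at once.
    state = [(-1) ** dot(a, x) / size ** 0.5 for x in range(size)]
    # Second Hadamard layer.
    final = [
        sum((-1) ** dot(x, y) * state[x] for x in range(size)) / size ** 0.5
        for y in range(size)
    ]
    probabilities = [amplitude ** 2 for amplitude in final]
    return max(range(size), key=probabilities.__getitem__), probabilities


outcome, probabilities = bernstein_vazirani(0b101, 3)
print(outcome == 0b101, round(probabilities[outcome], 6))
```

The query touches every one of the $2^n$ basis states; if the amplitudes were confined to the $n$ points of a fixed shattered set, the interference in the second Hadamard layer that singles out $a$ would no longer be available.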

Proof: It suffices to slightly sharpen the proof of Theorem 4.2 from [20]. The key observation is that since queries
always have zero amplitude on computational basis states outside of the shattered set $S$, the effective value of the
domain size $N$ is $|S| = d$ rather than $|\{0,1\}^n| = 2^n$. With this modification, at the end of the proof of Theorem 4.2
we obtain the inequality $\sum_{i=0}^{T} \binom{d}{i} \ge 2^{d/6}$ where $T$ is the quantum query complexity of the algorithm (instead
of the inequality $\sum_{i=0}^{T} \binom{2^n}{i} \ge 2^{d/6}$ which appears in [20]). Now standard tail bounds on binomial coefficients (see e.g.
Appendix 9 of [19]) show that $T = \Omega(d)$. ■

We now give lower bounds on the quantum query complexity of $(\epsilon, \delta)$-PAC learning which depend on $\epsilon$ and $\delta$. We
require the following definition and fact:

Definition VI.4 A concept class $C$ is said to be trivial if either $C$ contains only one concept, or $C$ contains exactly
two concepts $c_0, c_1$ with each $x \in \{0,1\}^n$ belonging to exactly one of $c_0, c_1$.

Fact VI.5 (See [21]) Let $|\psi^{(0)}\rangle, |\psi^{(1)}\rangle$ represent states of a quantum system such that for some measurement $\Pi$ we
have $\langle\psi^{(0)}|\Pi|\psi^{(0)}\rangle \ge 1 - \delta$ and $\langle\psi^{(1)}|\Pi|\psi^{(1)}\rangle \le \delta$ for some $\delta > 0$. Then we have $|\langle\psi^{(0)}|\psi^{(1)}\rangle| \le 2\sqrt{\delta(1-\delta)}$.

It is clear that a trivial concept class can be learned exactly from any single (classical) example. For nontrivial
concept classes jQ] gave a classical sample complexity lower bound of $\Omega(\frac{1}{\epsilon}\log\frac{1}{\delta})$. We now extend this bound to the
quantum setting:

Lemma VI.6 Any quantum algorithm with a $QEX(c, \mathcal{D})$ oracle which $(\epsilon, \delta)$-learns a non-trivial concept class must
have quantum sample complexity $\Omega(\frac{1}{\epsilon}\log\frac{1}{\delta})$.

Proof: Since $C$ is non-trivial, without loss of generality we may assume that there are two inputs $x_0, x_1$ and two
concepts $c_0, c_1 \in C$ such that $c_0(x_0) = c_1(x_0) = 0$ while $c_0(x_1) = 0$, $c_1(x_1) = 1$.

Let $\mathcal{D}$ be the distribution where $\mathcal{D}(x_0) = 1 - 3\epsilon$ and $\mathcal{D}(x_1) = 3\epsilon$. Under this distribution, no hypothesis which is
$\epsilon$-accurate for $c_0$ can be $\epsilon$-accurate for $c_1$ and vice versa.

Let $|\psi^{(i)}\rangle$ be the state of the system immediately after the $T$ queries of $QEX(c_i, \mathcal{D})$ are performed. Then we have

$$\langle\psi^{(i)}|\underbrace{x_0, 0, x_0, 0, \dots, x_0, 0}_{\text{repeated } T \text{ times}}, 0, \dots, 0\rangle = (1 - 3\epsilon)^{T/2} \quad \text{for } i = 0, 1.$$

It is easy to see that any other computational basis state $|\dots, x_1, b, \dots\rangle$ which has nonzero amplitude in $|\psi^{(i)}\rangle$ must
have zero amplitude in the other possible state $|\psi^{(1-i)}\rangle$, because $c_0$ and $c_1$ disagree on $x_1$. Consequently we have
$\langle\psi^{(0)}|\psi^{(1)}\rangle = (1 - 3\epsilon)^T$. If $(1 - 3\epsilon)^T > 2\sqrt{\delta(1-\delta)}$ then Fact VI.5 dictates there is some output hypothesis which
occurs with probability greater than $\delta$ whether the target is $c_0$ or $c_1$; but this cannot be the case for an $(\epsilon, \delta)$-PAC
learning algorithm. Thus we must have $(1 - 3\epsilon)^{2T} \le 4\delta$, yielding $T = \Omega(\frac{1}{\epsilon}\log\frac{1}{\delta})$. ■
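
For completeness, the final step from $(1 - 3\epsilon)^{2T} \le 4\delta$ to the stated bound can be spelled out as below. This is a sketch under the additional assumptions $\epsilon \le 1/6$ and $\delta \le 1/16$, which are not stated in the source; it uses the elementary inequality $\ln\frac{1}{1-u} \le \frac{u}{1-u}$ for $0 \le u < 1$ with $u = 3\epsilon$, and the fact that $\ln\frac{1}{4\delta} \ge \frac{1}{2}\ln\frac{1}{\delta}$ when $\delta \le 1/16$.

```latex
\begin{align*}
(1-3\epsilon)^{2T} \le 4\delta
  &\;\Longrightarrow\; 2T \ln\frac{1}{1-3\epsilon} \;\ge\; \ln\frac{1}{4\delta} \\
  &\;\Longrightarrow\; T \;\ge\; \frac{\ln\frac{1}{4\delta}}{2\ln\frac{1}{1-3\epsilon}}
   \;\ge\; \frac{(1-3\epsilon)\,\ln\frac{1}{4\delta}}{6\epsilon}
   \;\ge\; \frac{1}{12\epsilon}\ln\frac{1}{4\delta}
   \;\ge\; \frac{1}{24\epsilon}\ln\frac{1}{\delta}
   \;=\; \Omega\!\left(\frac{1}{\epsilon}\log\frac{1}{\delta}\right).
\end{align*}
```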

Ehrenfeucht et al. obtained an $\Omega(\frac{d}{\epsilon})$ lower bound for classical PAC learning by considering a distribution $\mathcal{D}$
which distributes $\Theta(\epsilon)$ weight evenly over all but one of the elements in a shattered set. In other words, under $\mathcal{D}$ one
element in the shattered set has weight $1 - \Theta(\epsilon)$ and all the remaining $d - 1$ elements have equal weight $\frac{\Theta(\epsilon)}{d-1}$. We use
such a distribution to obtain the following quantum lower bound (no attempt has been made to optimize constants):

Theorem VI. 7 Let C he any concept class of VC dimension d-l- 1. Let 6 = 1/5. Then we have that for sufficiently 
large d (i.e. d> 625 suffices) and any < e < any quantum algorithm with a QEX{c,'D) oracle which (e, 5)-learns 

C must have quantum sample complexity at least j^poOTe • 

Proof: Let $\{x_0, x_1, \dots, x_d\}$ be a set of inputs which is shattered by $C$. We consider the distribution $\mathcal{D}$, first introduced
by Ehrenfeucht et al., which has $\mathcal{D}(x_0) = 1 - 8\epsilon$ and $\mathcal{D}(x_i) = \frac{8\epsilon}{d}$ for $i = 1, \dots, d$. Let $H(x) = -x\log x - (1-x)\log(1-x)$ denote
the binary entropy function. As is noted in [20], there exists a set $s^1, \dots, s^A$ of $d$-bit strings such that for all $i \ne j$ the
strings $s^i$ and $s^j$ differ in at least $d/4$ positions where $A \ge 2^{d(1 - H(1/4))} > 2^{d/6}$. For each $i = 1, \dots, A$ let $c_i$ be a concept
such that (i) $c_i(x_0) = 0$, and (ii) the $d$-bit string $(c_i(x_1), \dots, c_i(x_d))$ is $s^i$. The existence of such concepts follows from
Definition VI.1. Since we have e < our quantum PAC learning algorithm should successfully distinguish between
any two target concepts $c_i$ and $c_j$ with confidence at least $1 - \delta = 4/5$. Moreover, without loss of generality we may
suppose that e < ^^^^^ since otherwise Observation VI.3 yields the required lower bound.
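
The existence of a large set of $d$-bit strings that pairwise differ in at least $d/4$ positions is only cited in the text. One standard route to such a set is a greedy (Gilbert-Varshamov style) counting argument, and the sketch below evaluates that count numerically. This is an illustration under that assumption, not the cited proof; the helper names `binary_entropy` and `greedy_code_size_bound` are hypothetical, and logarithms are taken base 2.

```python
import math


def binary_entropy(x: float) -> float:
    """H(x) = -x log2(x) - (1 - x) log2(1 - x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def greedy_code_size_bound(d: int) -> float:
    """Lower bound on the number of d-bit strings with pairwise distance >= d/4.

    Greedy argument: each chosen string rules out at most one Hamming ball of
    radius ceil(d/4) - 1, so at least 2^d / |ball| strings can be chosen.
    """
    radius = math.ceil(d / 4) - 1
    ball = sum(math.comb(d, k) for k in range(radius + 1))
    return 2.0 ** d / ball


if __name__ == "__main__":
    rate = 1.0 - binary_entropy(0.25)
    print(f"1 - H(1/4) = {rate:.4f}")
    for d in (8, 40, 100):
        print(d, greedy_code_size_bound(d), 2.0 ** (d * rate))
```

In each printed row the greedy count is at least $2^{d(1 - H(1/4))}$, consistent with the exponent $1 - H(1/4) \approx 0.189$.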
We shall make use of the following standard inequality:

$$(1 - x)^t \ge 1 - xt, \quad \text{if } xt < 1, \quad \text{for } t \in \mathbb{Z}^+,\ x \in \mathbb{R}^+. \tag{1}$$

Given a target concept $c$, we write $|\xi_{i_1, i_2, \ldots, i_t}\rangle$ to denote the basis state

$$|\xi_{i_1, i_2, \ldots, i_t}\rangle = |x_{i_1}, c(x_{i_1}), x_{i_2}, c(x_{i_2}), \ldots, x_{i_t}, c(x_{i_t})\rangle.$$

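We define the state $|\phi_t\rangle$ to be

$$|\phi_t\rangle = (1 - 8\epsilon)^{t/2} |\xi_{0,0,\ldots,0}\rangle + (1 - 8\epsilon)^{(t-1)/2} \sqrt{\frac{8\epsilon}{d}} \sum_{i=1}^{d} \left( |\xi_{i,0,0,\ldots,0}\rangle + |\xi_{0,i,0,\ldots,0}\rangle + \cdots + |\xi_{0,0,\ldots,i}\rangle \right) + \alpha |z\rangle.$$

Here $|z\rangle$ is some canonical basis state which is distinct from, and hence orthogonal to, all states of the form $|\xi_{i_1, i_2, \ldots, i_t}\rangle$,
e.g. we could take $|z\rangle = |x_1, c(x_1), x_1, 1 - c(x_1), 0, 0, \ldots, 0\rangle$. The scalar $\alpha$ is a suitable normalizing coefficient so that
the Euclidean norm of $|\phi_t\rangle$ is 1.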

Let $|\psi_t\rangle$ denote the state of the quantum register after $t$ invocations of $QEX(c, \mathcal{D})$ have occurred. It is easy to see
from the definition of the $QEX(c, \mathcal{D})$ oracle that the amplitude of $|\psi_t\rangle$ on the basis state $|\xi_{0,\ldots,0}\rangle$ will be $(1 - 8\epsilon)^{t/2}$, and
that for each of the $td$ many basis states $|\xi_{0,0,\ldots,0,i,0,\ldots,0}\rangle$ (where $i$ ranges over all $t$ positions and ranges in value from 1
to $d$) the amplitude of $|\psi_t\rangle$ on this basis state will be $(1 - 8\epsilon)^{(t-1)/2} \sqrt{\frac{8\epsilon}{d}}$. We thus have that $\langle \psi_t | \phi_t \rangle = (1 - 8\epsilon)^t \left(1 + \frac{8\epsilon t}{1 - 8\epsilon}\right)$.
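
The overlap between $|\psi_t\rangle$ and $|\phi_t\rangle$ can be sanity-checked by brute force for small parameters. The sketch below is an illustration under stated assumptions: each $QEX(c, \mathcal{D})$ call contributes amplitude $\sqrt{\mathcal{D}(x)}$ with $\mathcal{D}(x_0) = 1 - 8\epsilon$ and $\mathcal{D}(x_i) = 8\epsilon/d$, and $|\phi_t\rangle$ carries the same amplitudes as $|\psi_t\rangle$ on exactly those basis states with at most one nonzero index. It compares the resulting sum against the closed form $(1 - 8\epsilon)^t (1 + \frac{8\epsilon t}{1 - 8\epsilon})$. The function names are hypothetical.

```python
import itertools
import math


def overlap_bruteforce(eps: float, d: int, t: int) -> float:
    """Sum of amp_psi * amp_phi over index tuples with at most one nonzero entry."""
    weight = [1.0 - 8.0 * eps] + [8.0 * eps / d] * d  # D(x_0), then D(x_1..x_d)
    total = 0.0
    for idx in itertools.product(range(d + 1), repeat=t):
        if sum(1 for i in idx if i != 0) <= 1:
            # Amplitude of |psi_t> on this basis state: product of sqrt(D(x_i)).
            amp = math.prod(math.sqrt(weight[i]) for i in idx)
            # |phi_t> is assumed to have the same amplitude on these states.
            total += amp * amp
    return total


def overlap_closed_form(eps: float, d: int, t: int) -> float:
    """(1 - 8 eps)^t * (1 + 8 eps t / (1 - 8 eps)); independent of d."""
    return (1.0 - 8.0 * eps) ** t * (1.0 + 8.0 * eps * t / (1.0 - 8.0 * eps))


if __name__ == "__main__":
    for eps, d, t in [(0.01, 3, 4), (0.02, 2, 3)]:
        print(eps, d, t, overlap_bruteforce(eps, d, t), overlap_closed_form(eps, d, t))
```

The enumeration visits $(d+1)^t$ tuples, so it is only practical for very small $d$ and $t$; it is meant as a check on the algebra, not as a way to evaluate the bound.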

If we let t = ^qq]^^ (note that t > 1 hy our assumption that e < ^5^773)' then have that (1 — 8e)* > (1 — Y^Td^ 
by 1^, and it is easy to check that (1 + j^fj) > 1 + -[2^ from the bounds on e and d in the theorem statement. We 
thus have that 

(^d0*)>l-^. (2) 
Now let us consider what happens if we replace each successive block of $t$ invocations of the $QEX(c, \mathcal{D})$
oracle in our PAC learning algorithm with the transformation

$$Q : |0^{t(n+1)}\rangle \rightarrow |\phi_t\rangle.$$



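If the learning algorithm makes a total of $T$ calls to $QEX(c, \mathcal{D})$ then we perform $T/t$ replacements. After all $T/t$ calls
to $Q$ in the modified algorithm, the initial state $|0 \ldots 0\rangle$ evolves into the following state $|\phi\rangle$:

$$|\phi\rangle = \underbrace{|\phi_t\rangle \otimes |\phi_t\rangle \otimes \cdots \otimes |\phi_t\rangle}_{T/t \text{ times}}.$$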



By Equation we have that (0t|</3) > (1 - iSd)^^*- ^ '^/lOO (i.e. if T < jg^), then by ^ this lower 

bound is at least |||, and this implies (since the original algorithm with T many QEX{c^T>) calls was successful on 
each target $c_i$ with probability at least 4/5) that the modified algorithm which makes at most $d/100$ many calls to
$Q$ is successful with probability at least 2/3. However, the exact same polynomial-based argument which underlies
the $\Omega(d/n)$ lower bound for PAC learning proved in [23] (and the improved lower bound of Observation VI.3)
implies that it is impossible for our modified algorithm, which makes at most $d/100$ many calls to $Q$, to succeed on
each target $c_i$ with probability at least 2/3. (The crux of that proof is that each invocation of a black-box oracle for $c$
increases the degree of the polynomial associated to the coefficient of each basis state by at most one. This property
is easily seen to hold for $Q$ as well: after $r$ queries to the $Q$ oracle, the coefficient of each basis state can be expressed
as a degree-$r$ polynomial in the indeterminates $c(x_1), \ldots, c(x_d)$.) This proves that we must have $T/t > d/100$, which
gives the conclusion of the theorem. ■

Combining our results, we obtain the following quantum version of the classical $\Omega(\frac{1}{\epsilon} \log \frac{1}{\delta} + \frac{d}{\epsilon})$ bound:

Theorem VI.8 Any quantum $(\epsilon, \delta)$-PAC learning algorithm for a concept class of VC dimension $d$ must make at
least $\Omega(\frac{1}{\epsilon} \log \frac{1}{\delta} + d + \frac{\sqrt{d}}{\epsilon})$ calls to the QEX oracle.



Several natural questions for future work suggest themselves. For the quantum exact learning model, is it possible
to get rid of the $\log \log |C|$ factor in our algorithm's upper bound and thus prove the conjecture of Hunziker et al. [17]
exactly? For the partitions problem, can we extend the range of partition sizes (as a function of $|C|$) for which there
can be a superpolynomial separation between the quantum and classical query complexity of learning the partition?
Finally, for the PAC learning model, a natural goal is to strengthen our lower bound on sample complexity to
$\Omega(\frac{d}{\epsilon})$ and thus match the lower bound of Ehrenfeucht et al. for classical PAC learning.